Related Applications

The present application is a continuation-in-part application ("CIP") of Parent [...] U.S. receiving office on January 28, 2000, and this application claims the benefit of priority under 35 U.S.C. § 119(e) of U.S. Provisional Application Nos. 60/165,124 and [...]. International Application Serial No. PCT/US00/02246 claims the benefit of priority [...] 29, 1999, U.S. Provisional Application Serial No. 60/118,206, filed February 1, 1999, U.S. Provisional Application Serial No. 60/126,593, filed March 26, 1999, U.S. Provisional Application Serial No. 60/134,093, filed May 14, 1999, and U.S. Provisional Application Serial No. 60/134,092, filed May 14, 1999. Each of the aforementioned applications is explicitly incorporated herein by reference in its entirety and for all purposes.

TECHNICAL FIELD 
This invention generally relates to genetics and microbiology. The invention provides novel methods to identify the function of and relationships between nucleic acid and [...]. The method is particularly useful for finding and identifying genes and polypeptides having potential therapeutic relevance in organisms, e.g., microorganisms, such [...] genes and polypeptides found by these methods. These genes and polypeptides are useful as potential drug targets.

BACKGROUND 

The determination of the functions of and relationships between nucleic acid [...] identity with genes and proteins of known function or, in the absence of [...] homology, laborious experimental work. The availability of many complete genome [...] function. Several methods have been developed which can predict the general function of proteins [...] their functional relationships rather than sequence [...] related when they form [...].

New methods [...] nucleic acid and protein sequences. [...]

SUMMARY 

The invention provides novel methods for characterizing the function of a nucleic acid or a polypeptide sequence that may be [...] essential [...] for the growth or viability of an organism [...] relationships between diverse [...] the invention comprising algorithms that can [...]. Characterization of nucleic acid and protein sequences can be the basis for [...] interact with those nucleic acids and polypeptides. For example, [...] characterization can [...] basis for the design of novel drugs, particularly [...] from a pathogen.

The invention provides a method for identifying a nucleic acid or a polypeptide sequence that may be a target for a drug [...]: (a) providing a first nucleic acid or a polypeptide sequence that is known to be a drug target; (b) providing at least one algorithm selected from the group consisting of a "domain fusion" method, a "phylogenetic profile" method and a "physiologic linkage" method, wherein the [...]; and, (c) [...] to a plurality of sequences using at least one of the algorithms as set forth in step (b) to identify a second sequence that has a functional relationship to the first sequence, thereby identifying a nucleic acid or a polypeptide sequence that may be a target for a drug.

The invention provides a method for identifying a nucleic acid or a polypeptide sequence that may be essential for the growth or viability of an organism comprising the following steps: (a) providing a first nucleic acid or a polypeptide sequence that is known to be essential for the growth or viability of an organism; (b) providing at least one algorithm capable of analyzing a functional relationship between nucleic acid or polypeptide sequences selected from the group consisting of a "domain fusion" method, a [...]; and, (c) [...] nucleic acid or the polypeptide sequence to a plurality of sequences using at least one of the algorithms as set forth in step (b) to identify a second sequence that has a functional relationship to the first sequence, thereby identifying a nucleic acid or a polypeptide sequence that may be essential for the growth or viability of an organism.

In one aspect of the methods of the invention, the drug is an antimicrobial drug. In another aspect, the first nucleic acid or a polypeptide sequence is derived from a pathogen. The pathogen can be a microorganism, such as Mycobacterium tuberculosis.

The plurality of sequences used to identify a second sequence can comprise a database of the gene sequences of an entire genome of an organism. The plurality of sequences used to identify a second sequence can comprise a database of the gene sequences derived from a pathogen.

In one aspect of the methods of the invention, the "phylogenetic profile" method algorithm comprises (a) obtaining data, comprising a list of proteins from at least two genomes; (b) comparing the list of proteins to form a protein phylogenetic profile for each protein, wherein the protein phylogenetic profile indicates the presence or absence of a protein belonging to a particular protein family in each of the at least two genomes based on homology of the proteins; and (c) [...] of proteins based on similar profiles, [...] a functional relationship. The phylogenetic profile [...] a probability (p) value threshold. The homology between the proteins [...]. The probability can be set with [...] on the total number of sequence comparisons [...] number of proteins in the first [...]. The presence or absence of a protein belonging to a particular protein family in each of the at least two genomes can be [...]. The evolutionary distance can be calculated by: (a) aligning two sequences [...]; (b) [...] evolution probability process by [...], where aa and aa' are any amino acids [...] conditional probability matrix; (c) accounting for [...] probability matrix by taking the product of the conditional probabilities for each aligned pair during the alignment of the two sequences, represented by

$P(p) = \prod_n p(aa_n \to aa'_n)$

[...] maximizing [...].

[...] a conditional probability matrix can be represented by:

$p(i \to j) = p(j)\,2^{\mathrm{BLOSUM62}_{ij}/2}$

where BLOSUM62 is an amino acid substitution matrix, and $p(i \to j)$ is the [...].

In alternative aspects of the methods of the invention, the "physiologic linkage" method algorithm identifies proteins and nucleic acids that participate in a common functional pathway; identifies proteins and nucleic acids that participate in the synthesis of a common structural complex; and, identifies proteins and nucleic acids that participate in a common metabolic pathway.

In one aspect of the invention, the "domain fusion" method algorithm comprises (a) aligning a first primary amino acid sequence of multiple distinct non-homologous polypeptides to second primary amino acid sequences of a plurality of proteins; and, (b) for any alignment found between the first primary amino acid sequences of all of the multiple distinct non-homologous polypeptides and at least one protein of the second primary amino acid sequences, outputting an indication identifying the aligned second primary amino acid sequence as an indication of a functional link between the aligned first [...] polypeptide sequences. The aligning can be performed by an algorithm selected [...] BLAST algorithm, a FASTA algorithm, and a PSI-BLAST algorithm. The multiple distinct [...] genome database. The plurality of proteins can have a known function. At least one of the multiple distinct non-homologous polypeptides can have a known function. [...] the multiple distinct non-homologous polypeptides can have an unknown function. The alignment can be based on the degree of homology of the multiple distinct non-homologous polypeptides to the plurality of proteins. The "domain fusion" method can comprise determining the significance of the aligned and identified second primary amino acid sequence by computing a probability (p) value threshold. The probability threshold can be set with respect to the value 1/NM, based on the total number of sequence comparisons that are to be performed, wherein N is the number of proteins in a first organism's genome and M [...] functional links between one first primary amino acid sequence of multiple distinct non-homologous polypeptides and an excessive number of other distinct non-homologous polypeptides for any alignment found between the first primary amino acid sequences of the [...] is known to be a drug target [...] selected from the [...] functional relationship between nucleic acid or polypeptide [...] method and a [...].

The invention provides a computer program product, stored on a computer-readable medium, for identifying a nucleic acid or a polypeptide sequence that may be essential for the growth or viability of an organism, the computer program product comprising instructions for causing a [...] providing a first [...] for the growth or viability of an organism; (b) [...] relationship between nucleic acid or polypeptide [...] selected from the group consisting of a "domain fusion" method, a "phylogenetic profile" method and a "physiologic [...] sequence to a [...] functional relationship to the first sequence and generating an [...].

The invention provides a computer system, comprising: (a) a processor; and, (b) a computer program product of the invention.

All publications, patents, patent applications, GenBank sequences and ATCC deposits, cited herein are hereby expressly incorporated by reference for all purposes.

The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the invention will be apparent from the description and drawings [...].

DESCRIPTION OF DRAWINGS 

Figure 1 is an example of functional linkages predicted between InhA (Rv [...]

[...] 3795), which is a target of the drug ethambutol, and other TB genes using the phylogenetic [...]

Figure 3 is an example of predicted functional linkages between five TB genes [...] metabolism [...] and other TB genes.

Like reference symbols in the various drawings indicate like elements.

DETAILED DESCRIPTION 

The present invention provides novel methods for identifying [...] invention identify novel genes and polypeptides on the basis of [...] drug [...] provides methods for identifying the function of genes and polypeptides from Mycobacterium tuberculosis (MTB or TB). Based on this new functional determination, these genes and polypeptides can be used to screen for compositions capable of modifying the physiology and growth of Mycobacterium [...]. Thus, genes and polypeptides identified by the methods [...] genes known [...] associated with TB pathogenesis, survival or [...] important or unique to TB biochemical pathways are potential drug targets. TB genes and polypeptides that have no homologues [...] human [...] targets. The function of many of the TB genes and polypeptides identified [...] polypeptides with which they are functionally linked.

TB genes whose function was identified using the methods of the invention [...] (i.e., they can act as [...] drug targets) provides proof [...] identify TB genes and polypeptides that are drug targets. [...] U.S. provisional application serial no. 60/165,086. Potential TB drug targets are identified by [...]. Through the use of these methods, TB genes and [...] application serial no. [...] are illustrated in Tables 3 and 4, respectively (see below).

The phrases "functional link," "functionally related" and grammatical [...] (e.g., malate dehydrogenase and fumarase, which are both present in the TCA [...] each other, they are linked by virtue of their participation in the same [...] pathway). Other examples of linked or related polypeptides are where two [...] peptides are part of a protein complex, physically interact, or act upon each another [...].

[...] identifies proteins that are separate in one organism but joined [...] genomes and analyzes the pattern of inheritance of each protein across the different organisms. Proteins that have similar patterns of inheritance, either acquired or lost as a part of a group of proteins through evolution, are functionally linked. The gene proximity method identifies genes that remain physically close or "clustered" throughout evolution and are [...]

[...] to identify a TB gene or polypeptide functionally linked to a known drug target. Anti-TB drugs include isoniazid, rifampicin, ethambutol, streptomycin, pyrazinamide and [...]. For isoniazid, this drug is believed to act through enoyl-acyl reductase InhA, resulting in mycolic acid biosynthesis inhibition. Thus, TB genes or polypeptides functionally linked to enoyl-acyl reductase InhA are potential drug targets; [...] analysis of InhA, the target for isoniazid, the most widely used anti-[...] wall-related processes and lipid and polyketide [...]. Examples of the identification of several TB genes and polypeptides that are functionally related to the targets of these anti-TB drugs are shown in Figures 1 to 5.

"Domain Fusion" or "Rosetta Stone" Method

The "domain fusion" or "Rosetta Stone" method compares protein sequences across known nucleic acid databases (e.g., known genomes) to identify genes and proteins [...] in another organism. In such cases, the two separate proteins often carry out related or [...] known function of the other component. In addition, merely identifying links [...] using the method described herein provides valuable information (e.g., use as a target for an antibacterial drug), regardless of whether the function of one or more of the proteins used to form the link(s) is known. Because the two components do not have similar amino acid sequence, the function of one could not be inferred from the other on the basis of [...].

[...] for identifying drug targets (e.g., [...] targets) described herein (e.g., the "Rosetta Stone" method) are based on the idea that proteins that participate in a common structural complex, metabolic pathway, biological process or with closely related physiological functions, are functionally linked. In addition, these methods also are [...] proteins in one organism can often be found fused into a single polypeptide chain in a different organism. Similarly, fused proteins in one organism can be found as separate proteins in other organisms. For example, in a first organism one might identify two un-linked proteins "A" and "B" with unknown function. In another organism, one may find a single protein "AB" with a part that resembles "A" and a part that resembles "B". Protein AB allows one to predict that "A" and "B" are functionally related.
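By way of non-limiting illustration, the inference from a fused protein "AB" to a link between "A" and "B" can be sketched in Python. This is a minimal sketch under stated assumptions, not the claimed algorithm: alignments are assumed to have been computed beforehand by some external alignment program, and the names `Hit` and `rosetta_links`, as well as the rule that the two query proteins must align to non-overlapping regions of the same subject protein, are hypothetical choices made for the example.

```python
from collections import defaultdict
from itertools import combinations
from typing import NamedTuple


class Hit(NamedTuple):
    """One precomputed alignment of a query protein to a region of a subject protein."""
    query: str
    subject: str
    start: int  # first aligned residue in the subject (inclusive)
    end: int    # last aligned residue in the subject (inclusive)


def rosetta_links(hits, homologous_pairs=frozenset()):
    """Return {(A, B): [subjects]} for query pairs that align to
    non-overlapping regions of the same subject ("AB") protein.

    homologous_pairs holds frozensets of query pairs already known to be
    similar to each other; such pairs are skipped (assumption: the linked
    proteins should not themselves be homologous).
    """
    by_subject = defaultdict(list)
    for hit in hits:
        by_subject[hit.subject].append(hit)

    links = defaultdict(list)
    for subject, group in by_subject.items():
        for a, b in combinations(group, 2):
            if a.query == b.query:
                continue
            if frozenset((a.query, b.query)) in homologous_pairs:
                continue
            # The two aligned regions must not overlap on the subject.
            if a.end < b.start or b.end < a.start:
                pair = tuple(sorted((a.query, b.query)))
                if subject not in links[pair]:
                    links[pair].append(subject)
    return dict(links)


# Hypothetical hits: A and B cover separate parts of AB; C overlaps both.
example_hits = [
    Hit("A", "AB", 1, 120),
    Hit("B", "AB", 150, 300),
    Hit("C", "AB", 100, 200),
]
print(rosetta_links(example_hits))
```

In this toy input only the pair ("A", "B") is reported, because "C" overlaps the regions matched by both other proteins.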
The function or activity of each distinct protein in the "Rosetta Stone" method need not be known prior to performing the method (i.e., the function of A, B, or AB need not be known). Using the "Rosetta Stone" method to compare and analyze several unknown protein sequences can provide information regarding relationships of each protein, absent knowledge about the functional activity of the initially analyzed proteins themselves. For example, the information (i.e., the links) can indicate that the proteins are part of a common pathway, function in a related process or physically interact. Such information need not be based on the biological function of the individual proteins.

These methods can provide information regarding links between previously un-linked proteins that function, for example, in a concerted process. A marker, for example, for a particular disease state is identified by the presence or absence of a protein (e.g., Her2/neu in breast cancer detection). Links (i.e., linkage information) identified by the method, which link proteins "B" and "C" to such a marker suggest that proteins "B" and "C" are related by function, physical interaction or part of a common biological [...] marker. Such information is useful in designing screening methods and identifying drug targets (e.g., TB drug targets), making diagnostics, and designing therapeutics.

In one approach, the "Rosetta Stone" method is performed by sequence comparison that searches for incomplete "triangle relationships" between, for example, three proteins, i.e., for two proteins A' and B' that are different from one another but similar in sequence to another protein AB. Completing the triangle relationship provides useful information regarding the proteins' biological function(s), functional interaction, pathway relationships or physical relationships with other proteins in the "triangle."

Either nucleotide sequences or amino acid sequences can be used in the methods for identifying functionally related or linked genes or polypeptides. Where a nucleic sequence is to be used it can be first translated from a nucleic acid sequence to amino acid sequence. Such translation may be performed in all frames if the coding sequence is not known. Programs that can translate a nucleic acid sequence are known in the art. In addition, although, for simplicity, the description of this method discusses the use of a "pair" of proteins in the determination of a "Rosetta Stone" protein, more than 2 may be used (e.g., 3, 4, 5, 10, 100 or more proteins). Accordingly, one can analyze chains of linked proteins, such as "A" linked by a Rosetta Stone protein to "B" linked by a Rosetta Stone protein to "C", etc.
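As a non-limiting illustration of translating a nucleic acid sequence "in all frames," the following Python sketch produces the three forward and three reverse-complement translations of a DNA string using the standard genetic code. The function names `translate` and `six_frame_translations` are hypothetical, and the choice to render unrecognized codons as "X" is an assumption of the example rather than anything stated in the specification.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate a DNA string starting at its first base.

    Stop codons are written as '*'; codons containing characters other
    than A, C, G, T are written as 'X'. A trailing partial codon is ignored.
    """
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Return six translations: forward frames 0-2, then reverse frames 0-2."""
    forward = dna.upper()
    reverse = forward.translate(COMPLEMENT)[::-1]
    return [
        translate(strand[offset:])
        for strand in (forward, reverse)
        for offset in range(3)
    ]


for frame in six_frame_translations("ATGGCCTAA"):
    print(frame)
```

Each of the six strings could then be used as a probe in the sequence comparisons described above.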
By this method, groups of functionally related proteins can be found and their function [...].

A method can start by identifying the [...] either a nucleic acid sequence and/or a deduced [...] is then used to search a sequence [...] Washington University, St. Louis, MO [...] domain of proteins and protein families, see, e.g., Corpet [...] examined for its ability to act as a "Rosetta Stone" protein (i.e., a protein [...] sequences or domains from both protein A and protein B). [...] methods include, for example, BLAST [...], BLITZ (MPsrch) (see, e.g., [...] 11:330-331; and infra), and FASTA (see, e.g., Pearson (1988) Proc. Natl. Acad. Sci. USA 85(8):2444-2448; and infra).

The probe sequence can be any length (e.g., about [...]).

Where the probe sequences are used individually to search [...]. In this way, [...].

Typically, "Rosetta Stone"-linked proteins are only kept when the linked proteins show no [...] (e.g., hetero-dimers, trimers, etc.).

[...] lacking any functional information that is suspected of having two or more domains [...]. The primary amino acid [...] of the fusion protein is determined and [...] sequence. This probe sequence is used to search a sequence database (e.g., GenBank, PFAM [...]). Every protein in the sequence database is examined for homology to the potential fusion [...] polypeptide sequences or domains [...] different methods of performing such [...] performs comparisons of protein [...] defined as being "linked" so long as at least one protein per domain containing that domain [...] known functional characteristics. [...] significance of possible matches.

The statistical significance of an alignment score is described by the probability, P, of [...] when the sequences are shuffled. One way to compute a P value threshold is to first consider the total number of sequence comparisons that are to be performed. For example, if there are N proteins in E. coli and M in all [...], this number is N x M. If a comparison of this number of random sequences would result in one [...] yield a P value of 1/NM by chance, this is then set as the threshold.
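The 1/NM threshold can be illustrated with a short Python sketch. The function names, the tuple layout of the alignment results, and the protein counts and P values used in the example are hypothetical and serve only to show the arithmetic.

```python
def p_value_threshold(n_query_proteins, m_other_proteins):
    """Return 1/(N*M): with N*M comparisons, roughly one random pair is
    expected to reach this P value by chance."""
    if n_query_proteins <= 0 or m_other_proteins <= 0:
        raise ValueError("both protein counts must be positive")
    return 1.0 / (n_query_proteins * m_other_proteins)


def significant_pairs(alignments, n_query_proteins, m_other_proteins):
    """Keep (query, subject, p_value) tuples whose P value is below the threshold."""
    threshold = p_value_threshold(n_query_proteins, m_other_proteins)
    return [hit for hit in alignments if hit[2] < threshold]


# Hypothetical counts and P values, for illustration only.
example_hits = [("q1", "s1", 1e-12), ("q2", "s2", 1e-6)]
print(p_value_threshold(4000, 20000))
print(significant_pairs(example_hits, 4000, 20000))
```

With 4000 x 20000 comparisons the threshold is 1/80,000,000, so only the first hypothetical hit is retained.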
This method provides information regarding which proteins are functionally related (e.g., related biological functions, common structural complexes, metabolic pathways or biological process), a subset of which physically interact in an organism.

[...] Waterman (Smith (1981) Adv. Appl. Math. 2:482) or [...]; however, other, faster procedures such as BLAST, [...] families), or others known in the art (see infra discussion), can be used.

Filtering Methods

The Rosetta Stone Method provides at least two pieces of information. First, [...] second, [...] a few domains are linked to an excessive number of other domains by a "Rosetta Stone". For example, [...] of the domains are linked to fewer than 25 other domains. However, some [...] Src Homology [...] hundred other [...] domains (i.e., the domains linked to more [...] domain-based analysis, but after filtering only 749 links were identified. This [...] improved prediction of functionally [...]. There are a number of ways to filter the results to [...] increased [...] reduces the chance number of "Rosetta Stone" proteins thereby increasing the significance of functional links [...].

[...] effect on functional prediction, as paralogs usually have very similar function, but will affect the [...] of prediction of protein-protein interactions. This estimate is calculated for each linked protein pair, and can be estimated roughly as:

Fractional Error = 1 - [...]

[...] and B', and the linking protein is AB as above).

The error can also be estimated as 1 - T, where T is the mean percent of [...] linked by a Rosetta Stone protein, there are n proteins with the first domain but not the second, and m proteins with the second domain but not the first. The percent of true [...] T is therefore estimated as the smaller of n or m divided by n times m. T can be calculated for each set of linked domains, [...] can describe the confidence in any [...]

[...] regions or repeated common amino acid sequences being repeatedly identified in a "Rosetta Stone" protein by a plurality of distinct non-homologous polypeptides. To reduce this error, the percent of identity between the "Rosetta Stone" and the distinct non-homologous polypeptide can be measured. Alignment percent of about 50% to about 90%, or, alternatively, about 75%, between the "Rosetta Stone" and the distinct polypeptide are indicative of links that are not subject to the small peptide sequence.
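The 1 - T error estimate for a pair of linked domains can be illustrated in Python. This is a sketch only, and it assumes the reading that T is the smaller of n or m divided by n times m, where n and m count the proteins carrying only the first or only the second domain; the function names are hypothetical.

```python
def link_confidence(n_first_only, m_second_only):
    """Estimate T = min(n, m) / (n * m) for one pair of linked domains.

    n_first_only:  proteins with the first domain but not the second
    m_second_only: proteins with the second domain but not the first
    """
    if n_first_only <= 0 or m_second_only <= 0:
        raise ValueError("n and m must be positive")
    return min(n_first_only, m_second_only) / (n_first_only * m_second_only)


def link_error(n_first_only, m_second_only):
    """Estimated error 1 - T for the same pair of linked domains."""
    return 1.0 - link_confidence(n_first_only, m_second_only)


# Hypothetical counts: one-to-one domains versus domains with several paralogs.
print(link_confidence(1, 1), link_error(1, 1))
print(link_confidence(2, 5), link_error(2, 5))
```

Under this reading T equals 1/max(n, m), so the estimate falls as either domain acquires more paralogs, which is consistent with the statement that paralogs affect the prediction of protein-protein interactions.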

Phylogenetic Pathway Method 

The "phylogenetic profile" method compares protein sequences across all known genomes and analyzes the pattern of inheritance of each protein across the different organisms. In its simplest form, each protein is simply characterized by its presence or absence in each organism. For example, if there are 16 known genomes then each protein may be assigned a 16-bit code or phylogenetic profile. Since proteins that function together (e.g., in the same metabolic [...] part of a larger functional or structural complex) evolve in a correlated fashion, they [...] the same or similar patterns of inheritance, [...] of one protein may be [...] therapeutics.

[...] implemented in a binary code (i.e., [...]) describing the presence or absence [...]. [...] grouping of similar protein profiles may be [...] similarity can be modified depending upon particular [...]. For example, criteria requiring that the degree [...] bits being identical can be set, but may be [...] to 15 bits of the 16 bits would indicate relatedness of the protein profiles as well. [...] methods can be used to determine how similar two patterns must be in order [...].
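By way of non-limiting illustration, binary phylogenetic profiles and a simple bit-mismatch criterion can be sketched in Python. The input format (a mapping from each protein to the set of genomes in which a homolog was found), the function names, and the use of a Hamming distance cutoff are assumptions made for the example; the specification leaves the exact similarity criterion open.

```python
def build_profiles(presence):
    """Build one bit per genome for each protein.

    presence maps protein name -> set of genomes containing a homolog.
    Returns (sorted genome list, {protein: tuple of 0/1 bits}).
    """
    genomes = sorted(set().union(*presence.values()))
    profiles = {
        protein: tuple(1 if genome in found else 0 for genome in genomes)
        for protein, found in presence.items()
    }
    return genomes, profiles


def hamming(a, b):
    """Number of genome positions at which two profiles differ."""
    return sum(x != y for x, y in zip(a, b))


def linked_pairs(profiles, max_mismatches=0):
    """Return protein pairs whose profiles differ in at most max_mismatches bits."""
    names = sorted(profiles)
    return [
        (p, q)
        for i, p in enumerate(names)
        for q in names[i + 1:]
        if hamming(profiles[p], profiles[q]) <= max_mismatches
    ]


# Hypothetical presence/absence data over three genomes.
presence = {
    "P1": {"g1", "g3"},
    "P2": {"g1", "g3"},
    "P3": {"g2"},
    "P4": {"g1", "g2", "g3"},
}
genomes, profiles = build_profiles(presence)
print(genomes, profiles)
print(linked_pairs(profiles))
print(linked_pairs(profiles, max_mismatches=1))
```

Requiring identical profiles links only P1 and P2 in this toy data; allowing one mismatched bit, analogous to accepting 15 of 16 matching bits, also links each of them to P4.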

[...] Phylogenetic [...] viral, bacterial, archaeal or eukaryotic. The method of phylogenetic profile grouping provides the [...] characterized proteins based upon [...] string with N entries, each one bit, where N corresponds [...] 100 to 1000 or [...] found the entry [...]. Proteins are clustered according to the similarity of their [...] pattern of inheritance, and by [...] proteins are likely to be similar to characterized [...].

In order to decide whether a genome contains a protein related to another particular protein, the query amino acid sequence is aligned with each of the proteins from [...] in question using known alignment algorithms (see above). To determine [...] the sequences are shuffled is described. One way to compute a P value threshold is to first consider the total number of sequence comparisons that are to be made. If there are N proteins in a first organism's genome and M in all other genomes this number is N x M. If this number were compared to random sequences it would be expected that one pair would yield a P value of 1/NM. This value can be set as a threshold. Other thresholds may be used and will be recognized by those of skill in the art.

A non-binary phylogenetic profile can be used. In this method, the phylogenetic profile is a string of N entries where the n-th entry represents the evolutionary distance of the query protein to the homolog in the n-th genome. To define an evolutionary [...] between two sequences is performed. Such [...] described above. The evolution is represented by a Markov process with [...] rates, over a fixed interval of time, given by a conditional probability [...]:

$p(aa \to aa')$

where aa and aa' are any amino acids. One way to construct such a matrix is to convert the BLOSUM62 amino acid substitution matrix (or any other amino acid substitution matrix, e.g., PAM100, PAM[...]) from a log odds matrix to a conditional probability (or transition) matrix:

$p(i \to j) = p(j)\,2^{\mathrm{BLOSUM62}_{ij}/2}$

where $p(i \to j)$ is the probability that amino acid i will be replaced by amino acid j through point mutations according to the BLOSUM62 scores. The $p_j$'s are the abundances of amino [...] conditions that:

$\sum_j p(i \to j) = 1$.    (2)

The probability of this process is computed to account for the observed alignment by taking the product of the conditional probabilities for each aligned pair [...].

A family of evolutionary models is then tested by taking powers of the [...] two sequences. For example, one might simply count the number of positions in the [...] the two proteins have adopted different amino acids.
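The idea of testing a family of evolutionary models by taking powers of a conditional probability matrix can be sketched in Python. This is an illustrative sketch under several assumptions: a toy two-letter alphabet with a hypothetical one-step matrix is used in place of a BLOSUM62-derived 20 x 20 matrix, only integer powers up to a fixed limit are searched, and the power that maximizes the likelihood of the aligned pairs is taken as the distance. The function names are hypothetical.

```python
import math


def mat_mul(x, y):
    """Multiply two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def log_likelihood(matrix, index, aligned_pairs):
    """Sum of log p(a -> b) over aligned residue pairs (entries must be > 0)."""
    return sum(math.log(matrix[index[a]][index[b]]) for a, b in aligned_pairs)


def evolutionary_distance(p, alphabet, aligned_pairs, max_power=50):
    """Return the integer power alpha in 1..max_power of the one-step matrix p
    that maximizes the product of conditional probabilities over the alignment."""
    index = {letter: i for i, letter in enumerate(alphabet)}
    current = [row[:] for row in p]
    best_alpha = 1
    best_ll = log_likelihood(current, index, aligned_pairs)
    for alpha in range(2, max_power + 1):
        current = mat_mul(current, p)  # current now holds p to the power alpha
        ll = log_likelihood(current, index, aligned_pairs)
        if ll > best_ll:
            best_alpha, best_ll = alpha, ll
    return best_alpha


# Hypothetical one-step matrix over a toy alphabet; each row sums to 1.
toy_p = [[0.95, 0.05],
         [0.05, 0.95]]
identical = [("A", "A")] * 5 + [("B", "B")] * 5
diverged = [("A", "A")] * 4 + [("B", "B")] * 4 + [("A", "B"), ("B", "A")]
print(evolutionary_distance(toy_p, "AB", identical))
print(evolutionary_distance(toy_p, "AB", diverged))
```

An alignment with no substitutions is best explained by the smallest power, while an alignment with more mismatches selects a larger power, so the selected power behaves as a distance that could populate the n-th entry of a non-binary profile.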

[...] of homologous proteins. Functional proteins could then be clustered or grouped by matching similar trees, rather than vectors or matrices.

In order to predict function, different proteins are grouped or clustered according to the similarity of their phylogenetic profiles. Similar profiles indicate a [...]. Grouping or clustering may be accomplished in many ways. The simplest is [...] the Euclidean distance between two profiles. Another method is to compute a [...].
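As a non-limiting illustration of grouping by Euclidean distance, the following Python sketch ranks proteins by the distance between their profiles. The profile values, the function names, and the nearest-neighbor style of grouping are hypothetical choices for the example; other clustering schemes could equally be used.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_profiles(profiles, query, k=1):
    """Return the k proteins whose profiles are closest to the query protein's."""
    others = [
        (euclidean(profiles[query], vector), name)
        for name, vector in profiles.items()
        if name != query
    ]
    return [name for _, name in sorted(others)[:k]]


# Hypothetical non-binary profiles (one evolutionary distance per genome).
example_profiles = {
    "P1": (1.0, 0.0, 2.5),
    "P2": (1.0, 0.5, 2.5),
    "P3": (4.0, 4.0, 0.0),
}
print(euclidean((0.0, 0.0), (3.0, 4.0)))
print(nearest_profiles(example_profiles, "P1"))
```

The same distance function applies to binary profiles, where the squared Euclidean distance equals the number of mismatched bits.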

Typically a genome database will be used as a [...] information. Where the genome database contains only the nucleic acid sequence, [...] sequence is translated to an amino acid sequence [...] be feasible but will likely be more difficult due to the degeneracy of the genetic code. Programs capable of translating a nucleic acid sequence are known in the art or easily [...] those of skill in the art to recognize a codon sequence for each amino acid. [...] that share some degree of homology. Such a comparison can be done [...] algorithms.

[...] "Structurally- or Metabolically-Linked" Method

[...] the genes that encode them, that participate in a common functional pathway (e.g., cell motility or cell division), or participate in the synthesis of the same or a similar structural complex [...] same or similar metabolic pathway (e.g., glycolysis, lipid [...]) [...] functional pathway groups are examples of "functionally linked" proteins. Having a [...] functional "goal" they evolve in a correlated fashion. Thus, [...] "homologs" in [...]. While these detection methods [...] widely genetically disparate organisms.

[...] metabolic pathway that can be identified by sequence identity searching (BLAST alignment E-value < 10^[...]) to polypeptides identified in a known pathway. For example, [...] against E. coli proteins; MTB proteins whose E. coli homologs (i.e., having high similarity by BLAST alignment) act adjacently in metabolic pathways as defined in the EcoCyc database (see, e.g., Karp (1998) Nucleic Acids Res. 26:50-53) were identified.

In another example, flagellar proteins are found in bacteria that possess flagella but not in other organisms. Accordingly, if two proteins have homologs in the same [...] of the invention use this concept to systematically map links between all the proteins coded [...] genome. Typically, functionally [...] with each other and, therefore, cannot be linked by conventional sequence alignment [...]. Accordingly, the methods of the invention identify drug targets that could not be [...] using conventional sequence comparison (i.e., sequence homology or sequence [...]).

[...] be used in conjunction with the "domain fusion" or "Rosetta Stone" method and also can be filtered by other methods that predict functionally linked proteins, such as the phylogenetic profile method or the analysis of correlated mRNA expression patterns. [...] that filtering by these two methods for the Rosetta Stone prediction for S. cerevisiae [...] to be [...] related as proteins that were observed to [...] experimental techniques like yeast 2-hybrid methods or co-immunoprecipitation methods.

For example, a combination of these methods of prediction can be used to establish links between proteins of closely related function. The methods of the invention (e.g., the "Rosetta Stone" method and the "phylogenetic profile" method) can be combined [...] Eisen (1998) "Cluster analysis and display of genome-wide expression patterns," Proc. Natl. Acad. Sci. USA, 95:14863-14868.

The various techniques, methods, and variations thereof described can be implemented in part or in whole using computer-based systems and methods. Additionally, computer-based systems and methods can be used to augment or enhance the functionality described above, increase the speed at which the functions can be performed, and provide additional features and aspects as a part of or in addition to those of the invention described elsewhere in this document. Various computer-based systems, methods, and implementations in accordance with this technology are described herein.

Proteins linked to current drug targets 

The invention also provides a novel method for identifying a polypeptide, or the nucleic acid sequence that encodes it, that is a target for a drug. The method analyzes the



BNSDOCID: <WO 0135317A1 



PCT/US00/31152 

WO 01/35317

[...] wherein at least one of the sequences is a known target of a drug or encodes a polypeptide drug target [...] method, or a combination thereof, between all "query" genome genes. Next, each of the proteins functionally linked to either a known drug target or to a sequence [...] of a known drug target are examined. These proteins [...] bacteria, viruses and the like. This method allows for the [...].

There are very few drugs that are effective for anti-tuberculosis therapy, [...] address this issue, the methods of the [...] tuberculosis (MTB or TB) proteins that are functionally linked to [...] of known drugs [...] organism as the drug, since the [...]. The anti-TB drugs included isoniazid, rifampicin, [...] metabolic pathways to be linked to fatty [...] essential for bacterial replication in host lungs (see, e.g., Cox (1999) [...]).

The inhA gene was also linked to an operon encoding two putative oxidoreductases and a gene of entirely unknown function. The inhA gene was further linked to a second operon that includes pepR and gpsI. PepR is a protease [...] homolog is [...] diaminopimelate, a [...] murE gene product and diaminopicolinate [...] and is likely to be essential for MTB (see below). GpsI is a putative [...]

[...] 96% of resistant isolates were found to have mutations of various types in a limited region of [...] gene (see, e.g., Yang (1998) J. Antimicrob. Chemother. 42:621-628). [...] found to another RNA polymerase subunit, rpoC, as well as to various tRNA synthases and ribosomal proteins. However, no functional links to uncharacterized proteins were found.

Ethambutol. This drug is effective against tuberculosis when used in combination with isoniazid. It is believed that the drug interacts with the EmbB protein, a probable arabinosyl-transferase, inhibiting the biosynthesis of arabinan, a component of cell-envelope lipids. As with rifampin, the evidence for this interaction is indirect, since mutations in the embB gene are responsible for ethambutol resistance (see, e.g., Lety (1997) Antimicrob. Agents Chemother. 41:2629-2633).

The "gene proximity" method correctly clusters embB with embA (Rv3794).
This cluster is linked to a set of mostly uncharacterized genes by the "phylogenetic profile"
r^seePigure^whichshowsananalysisof^ 

drug Ethambutol, and shows functional linkages to genes mostly of unknown function but 
with some indications of localization at the bacterial membrane. 
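
As a rough illustration of how a "phylogenetic profile" comparison of the kind referred to
above might be computed, the following Python sketch represents each gene as a
presence/absence vector across a panel of reference genomes and links genes whose vectors
differ in at most a chosen number of positions. The genome panel, the toy profiles
(including the hypothetical identifiers geneX and geneY) and the max_mismatches threshold
are illustrative assumptions and are not taken from this specification.

```python
from itertools import combinations


def hamming(a, b):
    """Number of genomes in which two presence/absence profiles differ."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same genomes")
    return sum(x != y for x, y in zip(a, b))


def link_by_profile(profiles, max_mismatches=0):
    """Return gene pairs whose profiles differ in at most max_mismatches genomes.

    profiles: dict mapping gene name -> tuple of 0/1 flags, one per reference genome.
    Genes present in every genome or in none carry no signal and are skipped.
    """
    informative = {g: p for g, p in profiles.items() if 0 < sum(p) < len(p)}
    links = []
    for g1, g2 in combinations(sorted(informative), 2):
        if hamming(informative[g1], informative[g2]) <= max_mismatches:
            links.append((g1, g2))
    return links


# Toy data (hypothetical): each column is one of five reference genomes.
toy_profiles = {
    "embA": (1, 0, 1, 1, 0),
    "embB": (1, 0, 1, 1, 0),
    "geneX": (1, 0, 1, 1, 0),  # hypothetical uncharacterized gene
    "rpoC": (1, 1, 1, 1, 1),   # present everywhere, so uninformative
    "geneY": (0, 1, 0, 0, 1),  # hypothetical gene with the opposite pattern
}

print(link_by_profile(toy_profiles))
```

In this toy panel the uncharacterized geneX is linked to embA and embB because all three
share one inheritance pattern; a real analysis would need many more genomes and some
measure of how likely a shared pattern is to arise by chance.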

Two of the uncharacterized genes, Rv1706c and Rv1800, belong to the
abundant PE/PPE family of proteins hypothesized to be a source of antigenic variation with
the potential ability to interfere with immune responses by inhibiting antigen processing (see,

one of the four copies of the mce operon. This operon consists of eight genes coding for
integral membrane proteins and proteins that have N-terminal signal sequences or
hydrophobic segments and are believed to be involved in pathogenicity (see, e.g., Cole
(1998) supra). Rv0528 codes for a hypothetical membrane protein and Rv2159c corresponds
to the murF gene, which participates in the biosynthesis of peptidoglycan precursors.

The majority of the "links," or functionally associated sequences, involved
proteins associated with processes related to the bacterial cell wall (with the possible
exception of atsA and the putative choline dehydrogenase Rv1279, whose relationship to
these processes is not immediately obvious). The proteins of unknown function are therefore
also expected to play some role in these processes and are thus of interest as potential drug
targets.
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^ ™,a I u gKB *yb in din gW «h.I6Sr R NA m dinhiW B pro tei n 
thesis Reside to .his compound merges from mutations in .he correspond^ gene 

£. effec. *re P tomycin resiaance by — « - «*« — ~ ° f ,6S *~ 
e g Srecva«sa„( 1 996)Anti mi crob.A ge n B Chc m oti ie r.40: 1 024-10 2 6). 

Although streptomycin doesn't directly target RpsL, the functional links

g enera,ed for titis pro.ein - examined, as any .arge. whose inMb^nwU, |«r 
Luo. bacteria. pro.ein syndesis is likely .o be an effective anngrowlh/ ant.-nucrob.al 

Til. pro^in synutesis.rela.ed proteins, including large ribosoma, subunr, protend 
U U In and LuTL. —a. subuni. proteins S4, S5, S7. S8, and Sll; CongaOon 
^^'lEf^ti.ech^^EL.c.pBand^^^appro^snbu^ 

clpC and clpX. 

Proteins linked to cell-wall related proteins

The invention also provides a novel method for identifying a nucleic acid or a
polypeptide sequence in an organism that is linked to a cell-wall related protein. The method
analyzes the functional relationship between at least two sequences, wherein at least one of
the sequences is a cell-wall related protein, or, the sequence is a nucleic acid sequence that
encodes a cell-wall related protein. The method comprises identifying proteins, and the
genes that encode them, that are functionally linked to a cell-wall related protein. The
functional linkage is determined by using the "domain fusion" method, the "phylogenetic
profile" method or the "physiologic linkage" method, or a combination thereof, as described
herein. Approximately eleven M. tuberculosis proteins are indicated by sequence

of eeU elongation and cell wall metabolism (see, e.g„ Broome-Smrth (1985) Eur^ Bro 
4 " ) Using *e methods of.be invention, tire functional linkages found for these 

30 shows an analysis of five of the approximately eleven NTIB protems presum 

plilUn to reveal functional linkages to various potential operons constsung of genes 
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M in various .spec* of ceU - metabolism, including cel. shape deterrninatio n and 
peptidoglycan biosynthesis, as well as more than ten genes of unknown function, which we can
now associate with cell wall metabolism.

Three of the proteins (pbpA, pbpB, and ponA1) reside in conserved gene
clusters, presumably operons. Other genes in the clusters around pbpA and pbpB are also
implicated in cell wall metabolism. For example, pbpA resides next to rodA, a
membrane-associated protein whose E. coli homolog determines cell shape and is required for
enzymatic activity of penicillin binding proteins (see, e.g., Matsuzawa (1989) J. Bacteriol.
171:558-560). Likewise, pbpB resides next to six peptidoglycan biosynthesis genes and the two
septum and cell wall formation proteins ftsW and ftsZ.

Two additional gene clusters were linked to these penicillin binding proteins
by either the "phylogenetic profile" or "Rosetta Stone" pattern methods of the invention.
One cluster is composed of the peptidoglycan synthetic protein murB and a putative
membrane protein of unknown function that the functional linkages suggest is involved in
cell wall metabolism. The second gene cluster contains four genes, three of which are
predicted to reside in the cell membrane or envelope. Therefore, the uncharacterized genes
in these clusters are likely to be involved in cell wall metabolism, closely related to the
function of the penicillin binding proteins, and are therefore promising drug targets.
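
The "Rosetta Stone" (domain fusion) pattern mentioned above can be sketched as follows,
under stated assumptions: two query proteins are treated as linked when both align to
distinct, non-overlapping regions of a single protein from another genome. The input
format, the max_overlap tolerance and every protein name in the toy data other than murB,
pbpA and pbpB are hypothetical; this is a minimal sketch and not the procedure defined
elsewhere in this specification.

```python
def overlap(a, b):
    """Length of the overlap between two closed intervals (start, end)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def rosetta_links(hits, max_overlap=0):
    """Find query-protein pairs that align to distinct regions of one fused protein.

    hits: list of (query_protein, fused_protein, start, end), where start and end
          are the aligned coordinates on the candidate fused protein.
    Returns a dict mapping each linked pair to the set of fused proteins supporting it.
    """
    by_fused = {}
    for query, fused, start, end in hits:
        by_fused.setdefault(fused, []).append((query, (start, end)))

    links = {}
    for fused, regions in by_fused.items():
        for i, (q1, r1) in enumerate(regions):
            for q2, r2 in regions[i + 1:]:
                if q1 != q2 and overlap(r1, r2) <= max_overlap:
                    pair = tuple(sorted((q1, q2)))
                    links.setdefault(pair, set()).add(fused)
    return links


# Hypothetical alignments of query proteins onto proteins from other genomes.
toy_hits = [
    ("murB", "fusedP1", 1, 300),
    ("geneM", "fusedP1", 320, 600),  # separate region of the same protein: a fusion
    ("pbpA", "otherP2", 10, 250),
    ("pbpB", "otherP2", 40, 260),    # same region: homologs, not a fusion
]

print(rosetta_links(toy_hits))
```

Requiring the two aligned regions not to overlap is what separates a fusion pattern from
two homologous proteins that simply hit the same domain.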

Another gene linked to cell wall metabolism by the computationally-derived
linkage methods of the invention is gcpE, see Figure 4, which shows that the uncharacterized
gene gcpE, known to be essential for bacterial survival (see, e.g., Baker (1992) FEMS
Microbiol. Lett. 73:175-180), is predicted to be involved in cell wall metabolism through its
functional links to a putative membrane protein and two murein hydrolase genes, lytB1 and
lytB2, involved in cell separation. The genes forming a putative operon with gcpE are
proposed as potential drug targets. The functional linkages place gcpE in a conserved gene
cluster with two genes of unknown function, one of which encodes a membrane protein.
However, the three genes show correlated inheritance with two homologs of lytB, an E. coli
gene involved in penicillin tolerance (see, e.g., Gustafson (1993) J. Bacteriol. 175:1203-1205)
and recently shown to encode a murein hydrolase essential for cell separation (see, e.g.,
Garcia (1999) Mol. Microbiol. 31:1275-1277). The uncharacterized proteins from this
cluster are therefore expected to participate in processes similar to GcpE and might therefore
be promising drug targets.

Proteins linked to potentially novel pathways

The invention also provides a novel method for identifying a polypeptide or a
nucleic acid that encodes it, that is linked to potentially novel biochemical (e.g., biosynthetic)
pathways. The method analyzes the functional relationship between at least two
sequences, wherein at least one of the sequences is associated with a biochemical pathway,
e.g., a pathway in a microorganism that enables the pathogen to evade an immune process.
The method comprises identifying proteins, and the genes that encode them, that are
functionally linked to the pathway-linked sequences. The functional linkage is determined
by using the "domain fusion" method, the "phylogenetic profile" method or the "physiologic
linkage" method, or a combination thereof, as described herein.

For example, the htrA gene encodes for a putative heat shock protein

h „m„.ogous to HtiA from Stf— ■ « — ** ^i^T 

■ "p-alc proteins. Masons in Otis protein have been linked with reduce* vabtirty m 
' ge S ( S ee,e.g., J ohnson099.)Mo..Mlcrob i ol. 5 :40M07). Thus,,twns 

lided to Lestigate tire function of htiA. Using me memods of me mvennon, resuits 

« is predicted with very high reliability to function with me uncharactenzed gene 
. ZZtt^S, Ichlws *. involvement of htrA in a potential* novel pathway 

genesmostiyofunlmovmfnnction.suggesting.heexis.enceofanov.lpamway. The 

IX chL.erizeti pro.eins sugges. *. .he pathway rcl a,es .o — 
process such as signaling and/or transport. The lack of eukaryotic homologs for most of
the genes linked to htrA suggests that proteins of this pathway could be promising drug

Through its phylogenetic profile, htrA is linked to a group

proteins, including a ptrtative Upid esterase (Rvl900c), an ABC transporter ^ 
^characterized protein Rvl216c, which has weak homology to the lammtn » * 
30 „ ^ suggesting ma. i. migh. be a membrane protein. From ti.s «^ 
concluded that htrA is part of a novel pathway that involves membrane-associated processes,
such as signaling and/or transport. Because the majority of the proteins linked to htrA have
no eukaryotic homologs, and given the importance of htrA in Salmonella pathogenesis,
this pathway represents another potential source of novel targets for anti-tuberculosis drugs.

Proteins linked to essential proteins 

The invention also provides a novel method for identifying a polypeptide, or
the nucleic acid sequence that encodes it, that is linked to an essential protein (e.g., a protein
necessary for the growth of an organism, such as a bacterium). The method analyzes the
functional relationship between at least two sequences, wherein at least one of the sequences
is linked to an essential protein, or, the sequence is a nucleic acid sequence that itself is
essential or encodes a polypeptide linked to an essential protein. The functional linkage is
determined by using the "domain fusion" method, the "phylogenetic profile" method or the
"physiologic linkage" method, or a combination thereof, as described herein.

For example, the MIPS database (Munich Information Center for Protein
Sequences; MIPS provides access through its WWW server to a spectrum of generic
databases, including PEDANT, MYGD, MATD, MEST, the PIR-International Protein
Sequence Database, the protein family database PROTFAM, the MITOP database, and the
all-against-all FASTA database; see, e.g., Mewes (1999) Nucleic Acids Res. 27:44-48)
contains a list of 734 genes that are essential for Saccharomyces cerevisiae viability (see,
e.g., Mewes (1999) supra). A list of Mycobacterium tuberculosis genes orthologous to these
essential genes was generated. Using the methods of the invention, 60 such genes were
found. The products of these genes have a high likelihood of also being essential in the
tuberculosis bacterium and therefore could be promising therapeutic targets. Furthermore,
since the list of essential genes came from a eukaryote, there is a significant chance that these
genes would also be found in the human genome.

Automatic Method to Identify Drug Targets from

One aspect of the invention provides a computational method to identify
potential drug targets among the proteins expressed by a genome. This aspect takes
advantage of the functional linkages calculated between genes in a genome using the
methods described herein, as well as the detection of sequence homology and the knowledge
of a set of lethal or "essential" genes in one or more organisms.
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h *n of the eenes in the genome of an organism for which 
essential genes ^ ^ essentials is tt. yeas. S. 

• ~ ^ino a reverse sequence search (e.g., yeast vs. id; ou 
pennon tha, compr^ ^Tetric Lt-scoring sequence search. Incne 
loosing pa* cM genes were generated by 

exempt aspect, KH . M . ^ ^ ^ ^ of te 

finding all pairs of genes (TB^C,) wnere BLAST search of the 



BLAST E-value<= 1x10 . . ^ 

For example, a TB gene is an ortholog of a yeast gene if the yeast gene is the
best scoring sequence match when yeast is searched with the TB gene, and the TB gene is the
best scoring sequence match when TB is searched with the yeast gene.
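
The symmetric best-match test described above can be sketched in Python as below. This is
a minimal sketch under stated assumptions: search results are supplied as (query, subject,
E-value) tuples, and the default max_evalue of 1e-5 is a placeholder chosen for
illustration only, since the exponent of the E-value cutoff is not legible in this text.
All gene identifiers in the toy data are hypothetical.

```python
def best_hits(hits, max_evalue):
    """Map each query to its lowest-E-value subject, ignoring hits above max_evalue.

    hits: iterable of (query, subject, evalue) tuples from one search direction.
    """
    best = {}
    for query, subject, evalue in hits:
        if evalue > max_evalue:
            continue
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return {q: s for q, (s, _) in best.items()}


def reciprocal_best_hits(tb_vs_yeast, yeast_vs_tb, max_evalue=1e-5):
    """Return (tb_gene, yeast_gene) pairs that are each other's best match.

    max_evalue is an assumed placeholder, not the cutoff used in the source.
    """
    forward = best_hits(tb_vs_yeast, max_evalue)
    reverse = best_hits(yeast_vs_tb, max_evalue)
    return sorted((tb, y) for tb, y in forward.items() if reverse.get(y) == tb)


# Hypothetical search results in both directions.
tb_vs_yeast = [
    ("tbA", "yA", 1e-40),
    ("tbA", "yB", 1e-10),
    ("tbB", "yA", 1e-8),   # yA prefers tbA, so tbB is not an ortholog of yA
    ("tbC", "yC", 1e-2),   # too weak to count
]
yeast_vs_tb = [
    ("yA", "tbA", 1e-38),
    ("yB", "tbA", 1e-9),
    ("yC", "tbC", 1e-3),
]

print(reciprocal_best_hits(tb_vs_yeast, yeast_vs_tb))
```

Only the pair in which each gene is the other's top-scoring match survives; one-directional
best hits such as tbB to yA are discarded.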

H^eenes a s*of query genome genes tha, are ordrologs of known es^nrta. 
, known essennal genes, a set otq «yg eKdesi tedth e set of "putative 

A»ntKwoenomewaschosen. These genes were uc* ^ 
g enes muie *t 8~» ^ queIy genome genes am 

essentials-. Forme purposes of me algo m ^ 

genome. These genes aci a* essential in the 

query orgamsm. Functional iuu^b 

— r^rr^asme^cma 

pmahve — „ ^ geMS „ likely to be involved in important 

members of ^ membeIS ^ ^ put a ti ve essentia*. Lastly, 

the method removes from the set of genes in predicted essential pathways all of those genes
that have sequence homology to eukaryotic genes or proteins. The genes that remain after
this filtering step are the predicted drug targets for the query organism.
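
Read as a pipeline, the method above has three stages: collect the putative essentials,
add the genes functionally linked to them, and remove anything with eukaryotic homology.
The Python sketch below illustrates that flow under stated assumptions: the input
structures (a set of putative essentials, a gene-to-linked-genes dictionary, and a set of
genes with eukaryotic homologs) and all gene names are hypothetical stand-ins for results
the linkage and homology searches would supply.

```python
def predict_drug_targets(putative_essentials, links, eukaryote_homologs):
    """Sketch of the three-stage filter described above.

    putative_essentials: set of query-genome genes orthologous to known essential genes.
    links: dict mapping gene -> set of functionally linked genes.
    eukaryote_homologs: set of query-genome genes with sequence homology to
                        eukaryotic genes or proteins.
    """
    # Stage 1: start from the putative essentials.
    pathway_genes = set(putative_essentials)
    # Stage 2: add every gene functionally linked to a putative essential.
    for gene in putative_essentials:
        pathway_genes |= links.get(gene, set())
    # Stage 3: remove genes that have eukaryotic homologs.
    return pathway_genes - set(eukaryote_homologs)


# Hypothetical inputs for illustration only.
essentials = {"ess1", "ess2"}
functional_links = {"ess1": {"geneA", "geneB"}, "ess2": {"geneB", "geneC"}}
with_eukaryote_homologs = {"ess1", "ess2", "geneC"}

print(sorted(predict_drug_targets(essentials, functional_links, with_eukaryote_homologs)))
```

Because the putative essentials are found through orthology to a eukaryote, they would
typically be removed in the last stage themselves, leaving the linked genes that lack
eukaryotic homologs as the candidates.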

As a benchmark, this method was applied to the M. tuberculosis genome. Of
the over 3900 genes in TB, 11 were identified as potential drug targets. Comparing this list
of 11 predicted targets to the less than 10 known anti-TB drug targets, one gene was a
known drug target and one was linked to a known drug target. Accordingly, the algorithm of
the invention performed statistically significantly much better than a random choice of genes.
A rough estimate of statistical significance suggests that one would expect to see 2 of 10

. ^^ofoccmringhyrandomchanceofB.Sx.O-, Therefore, mis em^emrf .he 
method is an entirely computational algorithm drawing on the demonstrated ability of the
general methods of the invention to predict functional linkages between genes and to
effectively identify drug targets in bacteria. The effectiveness of this method to identify
novel drug targets was clearly demonstrated when the algorithm was applied to the M.

this issue, using the methods of the invention, functional links to the essential genes were
searched. Functional links were selected which either do not have homologs in yeast, or the
enzymatic activity of their products is known to be absent in human cells. Using
high confidence data, functional links for 23 of the genes (indicated in bold in Table 1)
were found.
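
For the benchmark discussed earlier in this section (11 predictions drawn from over 3900
genes, with fewer than 10 known anti-TB drug targets), one plausible way to make a rough
significance estimate is a hypergeometric tail probability. The Python sketch below is an
assumption about how such an estimate could be formed; it is not the calculation used in
this specification, and the round inputs (3900 genes, 10 relevant genes, 11 predictions,
at least 2 hits) are illustrative.

```python
from math import comb


def prob_at_least(k, population, successes, draws):
    """P(X >= k) when drawing without replacement (hypergeometric tail)."""
    total = comb(population, draws)
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


# Assumed round numbers: about 3900 genes, 10 relevant genes, 11 predictions.
p = prob_at_least(2, population=3900, successes=10, draws=11)
print(f"P(at least 2 hits by chance) = {p:.2e}")
```

The smaller this probability, the less likely it is that a random choice of 11 genes would
recover as many relevant genes as the algorithm did.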
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Eight of these were linked to 12 unique MTB genes tha. satisfied «he eriteria 

of the invention's methods (Table 1). Exemplary findings include:

of Ure mvenuo £nco de, the enzyme dihydropteroare synthase 

(DHPS^JmUemrsetofsu— 

eukaryotes, DHPS activity is not found in human cells (see, e.g., Huovinen (1995)
Antimicrob. Agents Chemother. 39:279-289).

(2) the product of the gene folK, a 7,8-dihydro-6-hydroxymethylpterin
pyrophosphokinase, has recently

♦ -i^rseeeg Stammers (1999) FEBS Lett. 456:49-53). 
— — ^^o.o^stro.g.y.m.edmmeeasen^ye^senepe^ 

it a very compelling candidate for drug design. 
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* T»hi e 1 that are functionally linked to genes without 
Table 2. Subset of genes from Table 1 that are functionally linked to genes without
yeast homologs.



Comments 



stimulates DnaK ATPase activity 



RvlOlO Irv1009 
IrvIOII 



Rv2439c 



" similar to Rcoli hypothetical protein t ctH 

Possible lipoprotein, similar to various other MTB protems 

rviuij. similar to E.coli hypot hetical protein YcbH 

Rv2421c proA γ-glutamyl phosphate reductase
Rv2440c obg Obg GTP-binding protein
Rv2441c rpmA 50S ribosomal protein L27

Rv3608c folP dihydropteroate synthase (DHPS)

Rv3606c folK 7,8-dihydro-6-hydroxymethylpterin pyrophosphokinase
Rv3607c folX may be involved in folate biosynthesis
folE GTP cyclohydrolase I

ftsH inner membrane protein, chaperone

folK 7,8-dihydro-6-hydroxymethylpterin pyrophosphokinase
folX may be involved in folate biosynthesis
folP dihydropteroate synthase (DHPS)
^Genes without yeast homologs shown in boldface
t DHPS activity is found in some eukaryotic cells but not in human cells



lRv3609c 



Rv3609c 
RV3610C 
Rv3606c 
Rv3607c 
RV3608C* 



In summary, the methods of the invention allowed identification of this
combination of functional linkages to essential genes. This information, together with the
lack of homologs for these genes, makes this group of proteins promising drug
targets with a low likelihood of toxicity from the inhibition of a host equivalent.
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u • I ™ented in par. or in whole using computer-based systems and methods, 
herein can be implemented m part o or enhance the 

Aod itionaUy, computer-based systems and «-^- « wL, me functions can 
. . .;,;„ „ d algorithms described herein, increase the speed at wmcn to 

, r^S- described eisewhem in mis document- Various exemplary computer-based 

^pmsentedhen*. ^ ^ toclude a ^ memory, such as a random 
access memory (RAM) and can also include a secondary memory. The secondary memory
can include, for example, a hard disk drive and/or a removable storage drive, representing a
floppy disk drive, a magnetic tape drive, an optical disk drive, etc. The removable storage

^1^^^ — ^ medium. Removable medU 

can oe a mw* ^movable storage media can includes a 

In alternative embodiments, secondary memory may m 

--. ======== . 

system, bucnmeanav j Mrtrif t ae interface (such as the found 

Examples of such can include a program cartridge and cartridge interface (such as that found
in video game devices), a removable memory chip (such as an EPROM, or PROM) and
associated socket, and other removable storage units and interfaces that allow software and
data to be transferred from the removable storage unit to the computer system.

The computer system can also include a communications interface.
Communications interfaces allow software and data to be transferred between computer
systems and external devices. Examples of communications interfaces include modems,
network interfaces (such as, for example, an Ethernet card), communications ports, PCMCIA
slots and cards, and the like. Software and data transferred via a communications interface
can be in the form of signals capable of being received by a communications interface.
These signals can be provided to
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,-ons interface via a channel capable of carrying signals and can be implemented 
""rr^lotcab.e.nbetop.csctometcommnmcaticnam^um.Some 

interface, and other communications channels. 

As used herein, the terms "computer program medium" and "computer usable
medium" are used to generally refer to media such as a removable storage device, a disk
capable of installation in a disk drive, and signals on a channel. These computer program
products are means for providing software or program instructions to
computer program "f™ ^ logi c) can be stored in 

rr^lS— Computerprogramscanalso.rec.vedvlaa 

^ „ perform me features of the present invention as discussed 

^ams^ien execmed, enable me processor to perform the features 

JZ. Accordingly, in one aspect of me invention, such computer programs represent 

toiplemenred using software, me software may be stored in, or ™™^^~Z 
program product and loaded into a computer system using a removable 
LTTLmunications interface. The comro. logic (software), when 

« ,he nrocessor to perform the functions of the invention as described herem. 

for example hardware component such as P ALs, application specific integral 
™^ ^ware components, totplementation of a hardware s*fe machine so as 

in ye, another embodiment, element are implanted using a combinahon of bom 

hardware and software. 

In another aspect the computer-based methods can be accessed or
implemented over the World Wide Web by providing access via a Web Page to the methods
of the invention. Accordingly, the Web Page is identified by a Universal Resource
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« wift a browser ,o se,ect a particuiar URL. which in rum causes the browser ,o seod 
a request for that URL or page to the server identified in the URL. Typically the server
responds to the request by retrieving the requested page, and transmitting the data for that
page back to the requesting client computer system (the client/server interaction is typically
performed in accordance with the hypertext transport protocol ("HTTP")). The selected page
is then displayed to the user on the client's display screen. The client may then cause the
server containing a computer program of the present invention to launch an application
comprising a method of the invention, for example, to identify a nucleic acid or a polypeptide
sequence that may be a target for a drug comprising the steps of (a) providing a first nucleic
acid or a polypeptide sequence that is known to be a drug target, (b) providing an algorithm
capable of analyzing a functional relationship between nucleic acid or polypeptide sequences
selected from the group consisting of a "domain fusion" method, a "phylogenetic profile"
method and a "physiologic linkage" method, and, (c) comparing the first nucleic acid or
polypeptide drug target sequence to a plurality of sequences using at least one algorithm to
identify a second sequence that has a functional relationship to the first sequence, thereby
identifying a nucleic acid or a polypeptide sequence that may be a target for a drug, based on
a query sequence provided by the client.

Nucleic Acids and Polypeptides 

The invention also provides isolated nucleic acids and polypeptides
comprising the sequences as set forth in Table 3 and Table 4 (below). As used herein,
"isolated," when referring to a molecule or composition, such as, e.g., an isolated infected
cell comprising a nucleic acid sequence derived from a library of the invention, means that
the molecule or composition (including, e.g., a cell) is separated from at least one other
compound, such as a protein, DNA, RNA, or other contaminants with which it is associated
in vivo or in its naturally occurring state. Thus, a nucleic acid or polypeptide or peptide
sequence is considered isolated when it has been isolated from any other component with
which it is naturally associated. An isolated composition can, however, also be substantially
pure. An isolated composition can be in a homogeneous state. It can be in a dry or an
aqueous solution. Purity and homogeneity can be determined, e.g., using any analytical
chemistry technique, as described herein.
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coding ornon-codmg(e.g., antisens j nucleot ides. The term also 

encompasses nuclccacd-bkc.tmc^ J™ Oxford Univ. Press (1991); Antisense 
Shades, A-mals ^^^^^3 Research m d Applications 

W t^^* l,,,lfc,,,4 

For example, in alternative embodiments, promoiers 

The terms "polypeptide, protein, ana p P 
. • 'Woes" or "conservative variants" and "mimetics 

hUmmM ^ ie amino acid substitotions, additions or deletions of those 

.esrduesU^areno,^ P basic , positively or negatively charged, poiar or non- 
having sumlar properties (eg aedtc, b . P substtntially alter 

For example, one exemplary guideline to select conservative substitutions includes (original
residue followed by exemplary substitution):
ala/gly or ser; arg/lys; asn/gln or his; asp/glu; cys/ser; gln/asn; gly/asp; gly/ala or pro;
his/asn or gln; ile/leu or val; leu/ile or val; lys/arg or gln or glu; met/leu or tyr or ile; phe/met
or leu or tyr; ser/thr; thr/ser; trp/tyr; tyr/trp or phe; val/ile or leu. An alternative exemplary
guideline uses the following six groups, each containing amino acids that are conservative
substitutions for one another: 1) Alanine (A), Serine (S), Threonine (T); 2) Aspartic acid (D),
Glutamic acid (E); 3) Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5)
Isoleucine (I), Leucine (L), Methionine (M), Valine (V); and 6) Phenylalanine (F), Tyrosine
(Y), Tryptophan (W); (see also, e.g., Creighton (1984) Proteins, W.H. Freeman and
Company; Schulz and Schirmer (1979) Principles of Protein Structure, Springer-Verlag). One
of skill in the art will appreciate that the above-identified substitutions are not the only
possible conservative substitutions. For example, for some purposes, one may regard all
charged amino acids as conservative substitutions for each other whether they are positive or
negative. In addition, individual substitutions, deletions or additions that alter, add or delete
a single amino acid or a small percentage of amino acids in an encoded sequence can also be
considered "conservatively modified variations."
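
The six-group guideline above lends itself to a simple lookup. The following Python sketch
classifies each difference between two equal-length, already-aligned peptide sequences as
conservative or non-conservative under that guideline only; the function names and the
example peptides are hypothetical, and residues outside the six groups (e.g., glycine or
proline) are treated as non-conservative because the guideline does not list them.

```python
# The six groups from the guideline above, as one-letter amino acid codes.
CONSERVATIVE_GROUPS = [
    set("AST"),   # 1) Alanine, Serine, Threonine
    set("DE"),    # 2) Aspartic acid, Glutamic acid
    set("NQ"),    # 3) Asparagine, Glutamine
    set("RK"),    # 4) Arginine, Lysine
    set("ILMV"),  # 5) Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # 6) Phenylalanine, Tyrosine, Tryptophan
]


def is_conservative(a, b):
    """True if two different residues fall in the same group."""
    a, b = a.upper(), b.upper()
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def classify_substitutions(reference, variant):
    """List (position, ref_residue, var_residue, kind) for each differing position.

    Positions are 1-based. Both sequences must already be aligned and equal in length.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned to the same length")
    result = []
    for pos, (r, v) in enumerate(zip(reference, variant), start=1):
        if r != v:
            kind = "conservative" if is_conservative(r, v) else "non-conservative"
            result.append((pos, r, v, kind))
    return result


# Hypothetical peptides for illustration.
for row in classify_substitutions("MKTAYD", "MRTSYG"):
    print(row)
```

As the text notes, these groups are not the only possible definition; a broader rule (for
example, treating all charged residues as interchangeable) would only require changing the
group table.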

The terms "mimetic" and "peptidomimetic" refer to a synthetic chemical
compound that has substantially the same structural and/or functional characteristics of the
polypeptides of the invention (e.g., ability to bind, or "capture," human antibodies in an
ELISA). The mimetic can be either entirely composed of synthetic, non-natural analogues of
amino acids, or, is a chimeric molecule of partly natural peptide amino acids and partly
non-natural analogs of amino acids. The mimetic can also incorporate any amount of natural
amino acid conservative substitutions as long as such substitutions also do not substantially
alter the mimetic's structure and/or activity. As with polypeptides of the invention which are
conservative variants, routine experimentation will determine whether a mimetic is within the
scope of the invention, i.e., that its structure and/or function is not substantially altered.

Polypeptide mimetic compositions can contain any combination of non-natural structural
components, which are typically from three structural groups: a) residue linkage groups other
than the natural amide bond ("peptide bond") linkages; b) non-natural residues in place of
naturally occurring amino acid residues; or c) residues which induce secondary structural
mimicry, i.e., to induce or stabilize a secondary structure, e.g., a beta turn, gamma turn, beta
sheet, alpha helix conformation, and the like. A polypeptide can be characterized as a
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M when all or some of its residues are joined by chemical means o*er*an natural 
"Ids. .ndividua, peptidomimetic residues ean be joined by peptide bonds. o*er 
chemical bonds or coupling means, such as, e.g., glutaraldehyde, N-hydroxysuccinimide
esters, bifunctional maleimides, N,N'-dicyclohexylcarbodiimide (DCC) or
N,N'-diisopropylcarbodiimide (DIC). Linking groups that can be an alternative to the traditional
amide bond ("peptide bond") linkages include, e.g., ketomethylene (e.g., -C(=O)-CH2- for
-C(=O)-NH-), aminomethylene (CH2-NH), ethylene, olefin (CH=CH), ether (CH2-O),
thioether (CH2-S), tetrazole (CN4-), thiazole, retroamide, thioamide, or ester (see, e.g.,
Spatola (1983) in Chemistry and Biochemistry of Amino Acids, Peptides and Proteins, Vol.
7, pp 267-357, "Peptide Backbone Modifications," Marcell Dekker, NY). A polypeptide can

also be characterized as a mimetic by containing all or some non-natural residues in place of
naturally occurring amino acid residues; non-natural residues are well described in the
scientific and patent literature.

The invention comprises nucleic acids comprising sequences as set forth in
Table 3, or comprising nucleic acids encoding the polypeptides as set forth in Table 4,
operably linked to a transcriptional regulatory sequence. As used herein, the term "operably
linked" refers to a functional relationship between two or more nucleic acid (e.g., DNA)
segments. Typically, it refers to the functional relationship of a transcriptional regulatory
sequence to a transcribed sequence. For example, a promoter (defined below) is operably
linked to a coding sequence, such as a nucleic acid of the invention, if it stimulates or
modulates the transcription of the coding sequence in an appropriate host cell or other
expression system. Generally, promoter transcriptional regulatory sequences that are
operably linked to a transcribed sequence are physically contiguous to the transcribed
sequence, i.e., they are cis-acting. However, some transcriptional regulatory sequences, such
as enhancers, need not be physically contiguous or located in close proximity to the coding
sequences whose transcription they enhance. For example, in one embodiment, a promoter is
operably linked to an ORF-containing nucleic acid sequence of the invention, as exemplified
by, e.g., a nucleic acid sequence as set forth in Table 3.

As used herein, the term "promoter" includes all sequences capable of driving
transcription of a coding sequence in an expression system. Thus, promoters used in the
constructs of the invention include cis-acting transcriptional control elements and regulatory
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are involved in regulating or modulating Ihe timing and/or rate of 
sequences ma. » -vo £ ^ . promoter ^ ta . ciMCtmg 

^npuon of a nuc «. ™* « _ . Option terminer. 

"1EC£^^~-- — 5 w r - — d Kgions ' 

„ ong m of rep~ ^ ^ ^ 

• • . nuances as set forth in Table 3, or comprising nucleic acids encodmg the 

r 0 » 1, constitutive* or inducibly, in any ceU, inducing prokaryotic yeasUunga.. 
The «rm inciudes expression systems lhat remain episomal or mtegrale ,mo Ihe host ceU 

rrznz: s^:— — ■ 

nololo J) using me exemplary nucleic acid and protein sequences of me invention, 
TTVea lose set forth inTab.es 3 and 4. In alternative aspects of ihe mvennon, 

or £ sequence identic (homo.ogy) lo die exempt sequences as se, 

.„ which test sequences are compared. When using a sequence comparison 

2 designated, if necessa*. and sequence a^ornhm program parameter are d« gna^ 
, , are used unless alternative parameters are designated herein. 
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^ re.ativ. to ft. reference sequence, based on the design** or default program 
"T^ A comparison window", as used herein, includes reference U> a segment of any 
' TlZZ ^condg-ous positions seiecred from fte *oup consisting of from 25 .o 

Z he Spared ft a reference seouence of fte same number of conhguous posttions aft« 
^0^ are optimally atigned. Meftods of aliment of seances ^nrpanson 

^Lwninfteari. ^^'^^-^J^ZS 
e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981),
by the search for similarity method of Pearson & Lipman, Proc. Nat'l. Acad. Sci. USA
85:2444 (1988), by computerized implementations of these algorithms (CLUSTAL, GAP,
BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics
Computer Group, 575 Science Dr., Madison, WI), or by manual alignment and visual

iDSPeCt ' 0n ' in one aspoctofthe invention On the meftods of the invention, and, to 

the CLUSTAL W program, see, e.g., Thompson (1994) Nuc. Acids Res. 22:4673-4680;
Higgins (1996) Methods Enzymol. 266:383-402. Variations can also be used, such as
CLUSTAL X, see Jeanmougin (1998) Trends Biochem Sci 23:403-405; Thompson (1997)
Nucleic Acids Res 25:4876-4882. In one aspect, the CLUSTAL W program described by

window size: 5, scoring method: percent, number of top diagonals: 5, gap penalty: 3, to
determine whether a nucleic acid has sufficient sequence identity to an exemplary sequence

the scope of the invention. This program creates a multiple sequence alignment from a group
of related sequences using progressive, pairwise alignments to show relationship and percent

create the alignment. PILEUP uses a simplification of the progressive alignment method
of Feng & Doolittle, J. Mol. Evol. 35:351-360 (1987). The method used is similar to the
method described by Higgins & Sharp, CABIOS 5:151-153 (1989). Using PILEUP, a
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referee sequence ( e.g., an exemplary OCA-associatcd sequence of «he invenaon) is 
compared to another sequence to determine the percent sequence identity relationship (i.e.,
that the second sequence is substantially identical and within the scope of the invention)
using the following parameters: default gap weight (3.00), default gap length weight (0.10),
and weighted end gaps. In one embodiment, PILEUP obtained from the GCG sequence
analysis software package, e.g., version 7.0 (Devereaux (1984) Nuc. Acids Res. 12:387-395),
using the parameters described therein, is used in the methods and to identify nucleic acids
within the scope of the invention. In another aspect, a BLAST algorithm is used in the
methods, e.g., to determine percent sequence identity (i.e., substantial similarity or identity)
and whether a nucleic acid is within the scope of the invention, see, e.g., Altschul (1990) J.
Mol. Biol. 215:403-410. Software for performing BLAST analyses is publicly available
through the National Center for Biotechnology Information, NIH. This algorithm involves
firs, identifying high scoring sequence pairs (HSPs) by identifying short words of fcnphW 
in *. query sequence, which either ma te b or satisfy some positive-valueti mresho d score T 
w hen IJed witi, a word of me same lengtir in a dambase sequence. T rs referred » as me 
neighbored word score mreshold (AUschul (.990) ^«). These utitia, 
word hits ac, as seeds for initiating searches ,0 find longer HSPs contauung mem. The word 
bits are men extendeu in bom directions along each sequence for as far as tire cumulate 
aUgnmen, score can be increase*. Cumulative scores are ca.cula.ed using, for nuCeobd^ 
seqTnces, me parameters M (reward score for a pair of marching rescues; always > 0) and 
N (penalty score for misma,ching residues, always < 0). For amino acid sequences a sconng 
^ ' extension oftiie word hiBm each direcbon 

matrix is used ,o calculate tire cumulauve score. Extension or 
are bahed when, me cumulative alignment score fans off by tire quamriy X from its 
maximum achieved value; the cumulative score goes to zero or be.ow. due to tire 
accumuUtion of one or more negative-scoring residue alignments; or.be end of enher 
sequence is reached. The BLAST a.gori<hm parameters W, T, and X determine tire 
sensitivity and speed of me alignment. In one embodiment to de«emtine if a noclere acrd 
sequence is within me scope of me invention, me BLASTN program (for nuc.eoude 
sequences) is useti incorporating as defau.. a wordlengm (W) of 1 1 , an =^-uo^)of 
, I0 M-5 N=4, and a comparison of bom shands. For amino acid sequences, me BLASTP 
pmgram uses as defauU pre— a worolength (W) of 3. an expectation (E) of 10, and me 
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BLOSUM62 scoring matrix (see, e.*„ Henikoff (1989) Ptoc. Nafi. Acad. Sci. USA 
89:10915). 
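
The word-hit extension step described above can be illustrated with a minimal sketch. It assumes an ungapped, rightward-only extension from an exact-match seed; the function name `extend_right` and the default value chosen for X are hypothetical and are not taken from any BLAST implementation. M and N follow the BLASTN defaults stated above, with N written as a negative penalty.

```python
def extend_right(query, subject, q_start, s_start, word_len,
                 match=5, mismatch=-4, x_drop=20):
    """Extend an exact-match word hit to the right without gaps.

    Returns (best_score, best_extension_length). Illustrative only:
    a real HSP search also extends leftward and applies threshold T.
    """
    # The seed word is assumed to match exactly, so it scores M per residue.
    score = best = match * word_len
    best_len = 0
    q = q_start + word_len
    s = s_start + word_len
    i = 0
    # Halt at the end of either sequence.
    while q + i < len(query) and s + i < len(subject):
        score += match if query[q + i] == subject[s + i] else mismatch
        i += 1
        if score > best:
            best, best_len = score, i
        # Halt when the score falls off by X from its maximum,
        # or when the cumulative score goes to zero or below.
        if best - score >= x_drop or score <= 0:
            break
    return best, best_len


if __name__ == "__main__":
    print(extend_right("ACGTACGTACGTTTTT", "ACGTACGTACGTAAAA", 0, 0, 4))
```

In a fuller treatment the same rule would also be applied leftward from the seed; the threshold T and the expectation E are not modeled in this sketch.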

Hybridization for Identifying Nucleic Acids of the Invention 

Nucleic acids within the scope of the invention include isolated or
recombinant nucleic acids that specifically hybridize under stringent hybridization conditions
to an exemplary nucleic acid of the invention (including a sequence encoding an exemplary
polypeptide) as set forth in Tables 3 and 4. Stringent conditions are sequence-dependent and
will be different in different circumstances. Longer sequences hybridize specifically at
higher temperatures. An extensive guide to the hybridization of nucleic acids is found in,
e.g., Tijssen (1993) infra. Generally, stringent conditions are selected to be about 5 to 10°C
lower than the thermal melting point (Tm) for the specific sequence at a defined ionic
strength and pH. The Tm is the temperature (under defined ionic strength, pH, and nucleic
acid concentration) at which 50% of the probes complementary to the target hybridize to the
target sequence at equilibrium (as the target sequences are present in excess, at Tm, 50% of
the probes are occupied at equilibrium). Stringent conditions will be those in which the salt
concentration is less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion
concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30°C for
short probes (e.g., 10 to 50 nucleotides) and at least about 60°C for long probes (e.g., greater
than 50 nucleotides). Stringent conditions may also be achieved with the addition of
destabilizing agents such as formamide.
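
The numerical limits in the preceding paragraph can be collected into a short sketch. The function names and the treatment of the "about" limits as hard cut-offs are assumptions made for illustration only; the Tm itself is taken as an input and is not computed here.

```python
def stringent_temperature_window(tm_c):
    """Return the (low, high) range, in deg C, lying 10 to 5 deg C below the Tm."""
    return (tm_c - 10.0, tm_c - 5.0)


def meets_stringent_conditions(sodium_molar, ph, temp_c, probe_len_nt):
    """Check the salt, pH and temperature limits stated for stringent conditions.

    Assumption: the 'about' limits are applied as exact thresholds, and
    formamide or other destabilizing agents are not modeled.
    """
    if not sodium_molar < 1.0:          # less than about 1.0 M sodium ion
        return False
    if not 7.0 <= ph <= 8.3:            # pH 7.0 to 8.3
        return False
    # At least about 30 deg C for short probes (10 to 50 nt),
    # at least about 60 deg C for long probes (greater than 50 nt).
    min_temp_c = 30.0 if probe_len_nt <= 50 else 60.0
    return temp_c >= min_temp_c


if __name__ == "__main__":
    print(stringent_temperature_window(72.0))
    print(meets_stringent_conditions(0.5, 7.5, 42.0, 25))
```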

For selective or specific hybridization, a positive signal (e.g., identification of
a nucleic acid of the invention) is about 10 times background hybridization. "Stringent"
hybridization conditions that are used to identify substantially identical nucleic acids within
the scope of the invention include hybridization in a buffer comprising 50% formamide, 5x
SSC, and 1% SDS at 42°C, or hybridization in a buffer comprising 5x SSC and 1% SDS at
65°C, both with a wash of 0.2x SSC and 0.1% SDS at 65°C. Exemplary "moderately
stringent hybridization conditions" include a hybridization in a buffer of 40% formamide, 1
M NaCl, and 1% SDS at 37°C, and a wash in 1X SSC at 45°C. Those of ordinary skill will
readily recognize that alternative but comparable hybridization and wash conditions can be
utilized to provide conditions of similar stringency. Nucleic acids which do not hybridize to
each other under stringent hybridization conditions are still substantially identical if the
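polypeptides which they encode are substantially identical. However, the selection of a
hybridization format is not critical; it is the stringency of the wash conditions that sets
forth the conditions that determine whether a nucleic acid is within the scope of the
invention. Wash conditions used to identify nucleic acids within the scope of the
invention include, e.g., a salt concentration of about 0.02 molar at pH 7 and a temperature
of at least about 50°C or about 55°C to about 60°C; or, a salt concentration of about 0.15 M
NaCl at 72°C for about 15 minutes; or, a salt concentration of about 0.2X SSC at a
temperature of at least about 50°C or about 55°C to about 60°C for about 15 to about 20
minutes; or, the hybridization complex is washed twice with a solution with a salt
concentration of about 2X SSC containing 0.1% SDS at room temperature for 15 minutes
and then washed twice by 0.1X SSC containing 0.1% SDS at 68°C for 15 minutes; or,
equivalent conditions. See Sambrook, Tijssen and Ausubel (see below) for a description
of SSC buffer and equivalent conditions.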

Nucleic acid and polypeptide sequences of the invention and other nucleic
acids used to practice this invention, whether RNA, cDNA, genomic DNA, vectors, viruses
or hybrids thereof, may be isolated from a variety of sources, genetically engineered,
amplified, and/or expressed recombinantly. Any recombinant expression system can be
used, including, in addition to bacterial cells, e.g., mammalian, yeast, insect or plant cell
expression systems. Alternatively, these nucleic acids and polypeptides can be synthesized
in vitro by well-known chemical synthesis techniques, as described in, e.g., Carruthers (1982)
Cold Spring Harbor Symp. Quant. Biol. 47:411-418; Adams (1983) J. Am. Chem. Soc. 105:661;
Belousov (1997) Nucleic Acids Res. 25:3440-3444; Frenkel (1995) Free Radic. Biol. Med.
19:373-380; Blommers (1994) Biochemistry 33:7886-7896; Narang (1979) Meth. Enzymol.
68:90; Brown (1979) Meth. Enzymol. 68:109; Beaucage (1981) Tetra. Lett. 22:1859; U.S.
Patent No. 4,458,066. Techniques for the manipulation of nucleic acids, such as, e.g.,
generating mutations in sequences, subcloning, labeling probes, sequencing, hybridization
and the like are well described in the scientific and patent literature, see, e.g., Sambrook, ed.,
Molecular Cloning: A Laboratory Manual (2nd ed.), Vols. 1-3, Cold Spring Harbor
Laboratory, (1989); Current Protocols in Molecular Biology, Ausubel, ed. John Wiley
& Sons, Inc., New York (1997); Laboratory Techniques in Biochemistry and
Molecular Biology: Hybridization With Nucleic Acid Probes, Part I. Theory and
Nucleic Acid Preparation, Tijssen, ed. Elsevier, N.V. (1993).

Polypeptides and peptides of the invention can also be synthesized, whole or
in part, using chemical methods well known in the art. See e.g., Caruthers (1980) Nucleic
Acids Res. Symp. Ser. 215-223; Horn (1980) Nucleic Acids Res. Symp. Ser. 225-232;
Banga, A.K., Therapeutic Peptides and Proteins, Formulation, Processing and Delivery
Systems (1995) Technomic Publishing Co., Lancaster, PA. For example, peptide synthesis
can be performed using various solid-phase techniques (see e.g., Roberge (1995) Science
269:202; Merrifield (1997) Methods Enzymol. 289:3-13) and automated synthesis may be
achieved, e.g., using the ABI 431A Peptide Synthesizer (Perkin Elmer) in accordance with
the instructions provided by the manufacturer.

The skilled artisan will recognize that individual synthetic residues and
peptides incorporating mimetics can be synthesized using a variety of procedures and
methodologies, which are well described in the scientific and patent literature, e.g., Organic
Syntheses Collective Volumes, Gilman, et al. (Eds) John Wiley & Sons, Inc., NY.
Polypeptides incorporating mimetics can also be made using solid phase synthetic
procedures, as described, e.g., by Di Marchi, et al., U.S. Pat. No. 5,422,426. Peptides and
peptide mimetics of the invention can also be synthesized using combinatorial
methodologies. Various techniques for generation of peptide and peptidomimetic libraries
are well known, and include, e.g., multipin, tea bag, and split-couple-mix techniques; see,
e.g., al-Obeidi (1998) Mol. Biotechnol. 9:205-223; Hruby (1997) Curr. Opin. Chem. Biol.
1:114-119; Ostergaard (1997) Mol. Divers. 3:17-27; Ostresh (1996) Methods Enzymol.
267:220-234. Modified peptides of the invention can be further produced by chemical
modification methods, see, e.g., Belousov (1997) Nucleic Acids Res. 25:3440-3444; Frenkel
(1995) Free Radic. Biol. Med. 19:373-380; Blommers (1994) Biochemistry 33:7886-7896.

Peptides and polypeptides of the invention can also be synthesized and
expressed as fusion proteins with one or more additional domains linked thereto for, e.g.,
producing a more immunogenic peptide, to more readily isolate a recombinantly synthesized
peptide, to identify and isolate antibodies and antibody-expressing B cells, and the like.
Detection and purification facilitating domains include, e.g., metal chelating peptides such
as polyhistidine tracts and histidine-tryptophan modules that allow purification on immobilized
metals, protein A domains that allow purification on immobilized immunoglobulin, and the
domain utilized in the FLAGS extension/affinity purification system (Immunex Corp, Seattle
WA). The inclusion of a cleavable linker sequence such as Factor Xa or enterokinase
(Invitrogen, San Diego CA) between the purification domain and GCA-associated peptide or
polypeptide can be useful to facilitate purification. For example, an expression vector can
include an epitope-encoding nucleic acid sequence linked to six histidine residues followed
by a thioredoxin and an enterokinase cleavage site (see e.g., Williams (1995) Biochemistry
34:1787-1797; Dobeli (1998) Protein Expr. Purif. 12:404-414). The histidine residues
facilitate detection and purification while the enterokinase cleavage site provides a means for
purifying the epitope from the remainder of the fusion protein. Technology pertaining to
vectors encoding fusion proteins and application of fusion proteins are well described in the
scientific and patent literature, see e.g., Kroll (1993) DNA Cell. Biol. 12:441-53.
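
The role of the cleavage site can be illustrated with a small sketch. It assumes the commonly cited enterokinase recognition motif Asp-Asp-Asp-Asp-Lys with cleavage after the lysine; the example fusion sequence, its domain order and the function names are hypothetical and do not correspond to any construct of the invention.

```python
EK_SITE = "DDDDK"   # enterokinase recognition motif; cleavage occurs after K
HIS6 = "HHHHHH"     # six-histidine purification tract


def has_his6(protein):
    """Return True if the protein contains a run of six histidines."""
    return HIS6 in protein


def cleave_enterokinase(protein):
    """Split a fusion protein after the first enterokinase site.

    Returns (tag_fragment, released_fragment); if no site is present,
    the protein is returned unchanged with an empty second fragment.
    """
    idx = protein.find(EK_SITE)
    if idx < 0:
        return protein, ""
    cut = idx + len(EK_SITE)
    return protein[:cut], protein[cut:]


if __name__ == "__main__":
    # Hypothetical fusion: Met + His6 tract + short linker + EK site + epitope.
    fusion = "M" + HIS6 + "GSG" + EK_SITE + "SIINFEKL"
    print(has_his6(fusion), cleave_enterokinase(fusion))
```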

The invention provides antibodies that specifically bind to the polypeptides of
the invention, as set forth in Table 4. These antibodies can be useful in the screening
methods of the invention. The polypeptides or peptide can be conjugated to another
molecule or can be administered with an adjuvant. The coding sequence can be part of an
expression cassette or vector capable of expressing the immunogen in vivo (see, e.g.,
Katsumi (1994) Hum. Gene Ther. 5:1335-9). Methods of producing polyclonal and
monoclonal antibodies are known to those of skill in the art and described in the scientific
and patent literature, see, e.g., Coligan, Current Protocols in Immunology,
Wiley/Greene, NY (1991); Stites (eds.) Basic and Clinical Immunology (7th ed.) Lange
Medical Publications, Los Altos, CA; Goding, Monoclonal Antibodies: Principles and
Practice (2d ed.) Academic Press, New York, NY (1986); Harlow (1988) Antibodies, A
Laboratory Manual, Cold Spring Harbor Publications, New York.

Antibodies also can be generated in vitro, e.g., using recombinant antibody
binding site expressing phage display libraries, in addition to the traditional in vivo methods
using animals. See, e.g., Huse (1989) Science 246:1275; Ward (1989) Nature 341:544;
Hoogenboom (1997) Trends Biotechnol. 15:62-70; Katz (1997) Annu. Rev. Biophys.
Biomol. Struct. 26:27-45. Human antibodies can be generated in mice engineered to produce
only human antibodies, as described by, e.g., U.S. Patent Nos. 5,877,397; 5,874,299;
5,789,650; and 5,939,598. B-cells from these mice can be immortalized using standard
techniques (e.g., by fusing with an immortalizing cell line such as a myeloma or by
manipulating such B-cells by other techniques to perpetuate a cell line) to produce a
monoclonal human antibody-producing cell. See, e.g., U.S. Patent Nos. 5,916,771; 5,985,615.

TABLE 3 
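
Each entry of this table opens with a header line such as `>Rv0050 ponA1 TB.seq 53661:55694 MW:71119`. As a reading aid, the following minimal sketch parses such a line. The field names, the regular expression and the inclusive treatment of the coordinate pair are assumptions made for illustration; they are not part of the specification.

```python
import re

# Hypothetical pattern for header lines of the form
# ">Rv0050 ponA1 TB.seq 53661:55694 MW:71119".
HEADER_RE = re.compile(
    r"^>(?P<locus>Rv\d{4}c?)\s+(?P<desc>.*?)\s*TB\.seq\s+"
    r"(?P<start>\d+):(?P<end>\d+)\s+MW:(?P<mw>\d+)\s*$"
)


def parse_header(line):
    """Parse a Table 3 style header line; return a dict, or None if it does not match."""
    m = HEADER_RE.match(line.strip())
    if m is None:
        return None
    start, end = int(m["start"]), int(m["end"])
    return {
        "locus": m["locus"],
        "description": m["desc"],
        "start": start,
        "end": end,
        # Assumes both coordinates are inclusive positions in TB.seq.
        "length_nt": end - start + 1,
        "mw": int(m["mw"]),
    }


if __name__ == "__main__":
    print(parse_header(">Rv0050 ponA1 TB.seq 53661:55694 MW:71119"))
```

Header lines damaged in reproduction will simply fail to match and return `None`, which makes them easy to flag for manual review.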

>Rv0002 dnaN DNA polymerase III, b-subunit TB.seq 2052:3257 MW:42114

AT^CGCOGCTACGACAAGAGTTGG^CCGAOTTG sstoccGGTGCTCT 

^CGTGTTGJTG^GGCTCGGAC ^ ^^^ctggcoGATTG 
CGCCGAGGCCCAGGTTGGCCCTG^TTGTrr GACGTTCATGTC eAAGGTAACCGGG 
. -GTCCGATAT^GGGC^^^ 

TCGCATTGACCTGCGGTAACGCCAGGTrru- ATTC GCCGAGGCAATCAG 
GCTGCCGACGCTGCCGGAAGAGACCG^T^CT^^^ GGOATC CGGGTCGA 
^^^^^^ g ^O^q ag qqjqqY^P^G^^GC1^^CG^^^GT^CGCCTGGCTGTTCGAGAACTG 

AATCCTCGGTGAGACGGTGGTTTl ^ __ _ _„ r-Tfsr^THCCGGCC AAG ACGCTGGC 

„ AAGTGGTCGGCGTCGTCGCCAGAT^CGAAGCGG^^GT^CT^GT^^^GGC ^^^^^^^^^ 

CGAGGCCGCCAAAGCGGGCATCGG^^ 

cctatctaacggacggtttgag™gttgcgct^gag^agtgtcttt tgaatgg 

30 GGCTGA 

>Rv0003 recF DNA replication and SOS induction TB.seq 3280:4434 MW:42181
> e mb,AL123456|MT B H37RV:3280^437. recF SEC HO GTAGATCTGGAATrGC 
s GATCCCGCTGACCGGCGGCGCTATCTGGATGA^^TGGCG^l^^^ ^^^^
GCGGTACGAGCCGAATATGAGAGGGTG^GC^CO^C^GACGG^^ ^^^^ 
GGAGCACGGTATCGGGGTGACCGGGGTGTGT^GAC^^TCTl^^GGT^^^ 

GC GGAGCAC^GGCT = GTGGCC = 

AGTGAAGAAGGWTACCAGCTGTTGGCG^CCGGAATC AGCAGCXCGG 
10 CAGCATGGATGTAAGCGGTCCCAGCGAGCAGTCAGATATCGA^^^C^W^^^ ^^^^ 

VCTGCGGTTGGCGGCCTATCAACTGTTACGCGTTGATG 



CCGCACCGTGACGACCTAATACTGCGACTAGGCGA1 

GGGAGGCGTGGTGGTTGGCGGTGGO^^^^^^^^^^^^^^^^^^ t ^^q 



iTGGTGAGCCGGTGTTGTTGCTCGAC 



CAGGTGTTGGTGACTGCCGCGGTGCTCGAGGAT 



GATGTCGGTGGTTCTGCCATGA 

>RV0005 9 yrB DN A g ^sesu b unn B mse q . 5 ,23 : 72«MW.7 M 41 

20 TGAGGGTACAGTGGTGTGGGAC 
ATGGGTAAAAACGAGGCGAGMGATC^GCCCTGGCG ATC GTGGCTGCC 

CCCCT GGGG r AGTGAAGGGC == 

CAGAAAAAGAAGGCCCAAGACGAATACC^ CCGGTGAGCGCGGTTTAC ACCATCTC 



GCCGTCCGCAAACGTCCCGGCATGTACATTGGCTOGA' ^ 
25 ATTTGGGAGGTGGTCGACAACGCGGTCGA^ 



AGTGAA" 



lC GCGCAGCCGAATCCACTGCACCGCA( 



ATrC^GAGTCGGTGC^ACCT^CGCCAACACCATCAACACCCACGAGGGCGGCACCCACGAAG 
TGAAGGTCAGCGAACCGCAGTTCGAGGGCCAGACCAAGWSCAAGTT^^^^^^^^^^^^^^

10 GGCGCTGGGCACCGGGATCCACGACG^GTTCG^^ 

GCTGATGGCCGACC^CGATGTTGACGG^GATAT^CGCTG CTGTACAAAC 

15 GTAGGTGAAATGGACGCTAAGGAGTTGTGGGJ^ACCACCATGGATC^^^^^^^^^^^^^^^ 






>Rv0006 gyrA DNA gyrase subunit A TB.seq

ATGACAGACACGACGTTGCCGCCTGACGACTCGCTCGACCGGATC ^^^^^^ 
CAGGAGATGCAGCGCAGCTACATCGACTATGCGATGAGCGTCATCGTC ^^^^^^^^ 
G^GGTGCGCGACGGGCTCAAGCCCGTGC/^C^CCGGGTGCTCTAT^CA^^^^^^^^^^^ 

25 TTCCGCCCGGACCGCAGCCACGCCAAGTCGGCOCGGTCGG GGCCCAGCCCTGGTCGC 
CCACCCGCACGGCGACGCGTCGATCTACGACAGCCTGGTG^A GACCCACC GGCG 
TGCGCTACCCGCTGGTGGACGGGCAGGG^ 

^^^^^^^^^^^^^^^^^^^^^^GGACGGCCG^GT^^AGAGCCGACGGTGCTACCC 
GAGGAGACAGTCGATTTCATCCCTAACTACGAC CG GTCGGCATGGCAACCAAT 
M AGCCGGTTCCCCAACCTGCTGGCCAACGGGT^AG^C^GCATCGCGG^ ^^^^^^^^^^^^ 

ATCCCGCCGCAGAAC^^^C^^^^^^^ 

GCCGACGAAGAGGAGACCCTGGCCGCGGTCATG aaaaoTGGCCGCGGCTCCAT 
TOCCGGACTGATCGTCGGATCCCAGGGCACCGCTGAT^^CTACAA^CTG ^^^^ 

TCGAATGCGCGGAGTTGTTGAGGTAGAAGAGGAVTCC^GCGGTCGT^CC^ ^^^^^^^^^^^ 
35 CGAGTTGCCGTATCAGGTCAACCACGACAACTTGAT^ 
CTACCGGCTGCGCAAGGCAAACGAGC^^^-CCACATn^^^^ ^^^^^^^^^^^^

GGCGCCT^3CW5CACTGGAACGCCAGCGCATCATCGACGACCTGGCCAAAATCGAGGCCGAG 
^C GATC -""— ---GCGACGACCGGCGTACCCGGATCATCGCGGCCGACGG 



^GACATCCTGGCAAAACCCGAGCGGCAGCGTGGGATCGTGCGCGACGAA 
GAAATCGTGGACAGGCACGGCGACGACCC 



AGACGTCAGCGACGAGGATTTGATCGCCCGCGAGGACGTCGTTGTCACTATCACCGAAACGGG 
GGGTGGGGGGTTGAAGCAGGACGACATCGTCGCGCACTTCTTCGTGTGCTCCACCCACGATTT 



10 ATACGCCAAGCGCACCAAGACCGATCTGTATCGCAGCCAGAAACGCGGCGGCAAGGGCGTGCA 



^GCGCGCGGGCAGCACGTGGGGAACCTGTTAGCCTTCCAGCCCGAGGAACGCA 



GATCGTGTTCTTCACGACCCAGGGACGGGTTTATGGGGCGAAGGCCTACGACTTGOCCGAGGC 

„ a^gg^^^gtgaa/^agtccaagctgaccgacttcgactccaatcgctcgggcggaatcgtgg 

" r^GCGCGACAACGACGAGOX^^^^^^- 



^^-^Tf-rTCTCGGCCAACGGGCAGTCCATCAGGTTCTC^i-^^^-- 

r^r^T^^TCGT^X^CCTCGGGTGTGCAGGGCATGCGGTTCAATATCGACGACCGGCTGCT 

GTCGCTGAACGTCGTGCGTGAAGGCACCTATCTGCTGGTGGCGACGTCAGGGGGCTATGCGAA 



ACGTACCGCGATGGAGGAATAGCCGGTACAGGGCCGCGGCGGTAAAGGTGTGCTGACGGTCAT 

iTTGGGGC 
&TCCGCAC 

GCGAGGGCGACACACTGTTGGCCATCGCG 



20 ACC3 1 • < - ' ■ " " "3GGGCGTTGATTGTCGACGACGACAGCGAGCTGT 

ATGCCGTCACTTCCGGCGGTGGCGTGATCCGCACCGCGGCA^ 



GTACGACCGCCGGCGCGGCAGGTTGGTTG 
TCACTTCCGGCGGTGGCGTG 

^GCAACGCCG^^G^A^^GGCG^GATAATGCCGTGGACGCCAACGGCGCAGACCAGACGGG 
25 CAATTAA 

>Rv0014c pknB serine-threonine protein kinase TB.seq 15593:17470

^^^^^Q^Q^^^^^^^GCi^GCGACCTCCGGTTGCACCGCGACGTTGCGGTCAAGGTGCTGC 
^I^r«CCACCCTGCAATCGTCGCGGTCTACGACACCGGTGAAGCCGAAACGCCCGCCG 

™ttg£™ca^ 

CGAAGGGC^^^TOACGCCCAAACGCGCCATCGAGGTCATCGCCGACGCCTGCCAAGCGCTGA 
„ ^Yj^^^^^CAGAACGGAATCATCCACCGTGACGTCAAGCCGGCGAACATCATGATCAGCGC 

gac^Tgcagt/Tggtgatg 

^TGACCCAGAC^GCAGCAGTGATCGGCACGGCGCAGTACCTGTCACCCGAACAGGCCCGGG 
OTGATTCCGTCGACGCCCGATCCGATGTC^^^
GSGGAGCCACCTTTCACCGGCGACTCACCCGTCTC^r^O^G 

^TI^!^!^^«^^^^^^CGTCGTGGTAACCATCGCCATCAACACGTTCGGCGGCATCACCCG 

T^c^^aSttcggggtcaatcctocgccgacgccatcgccacactgcaaaa 

CGACGTTOAAGTTCCCGACGTT - , CGQACTCGAC AATCCCACCGGACCACGTTAT 






CCGGGGCTTCAAAATCCGCACCTTGCAGAAGO 



GAGTGCAGGCGACGAGATCACAGTCAACGTGT 



CGGCACGGACCCGGCCGC^OGT^ 

CCACOGGACXCGAGCAACGCGAAATA^^^^^^^^ 



AGAAACTGACTGCCGCCGGATTCGGCC 

15 ;^g^=^^ 



rGGGCAAGGTCATCGGGACCAACCCGCCAGCCAACCAGACGTCGGCCATCACCAATGTGG 

CGA 

CGACGTGGCGCAGAAGAACCTCAACGTCTACGGCTTCAOCAAATTCAGTCAGGCCTCGGTGGA 



CCGTCCCGCCGGCGAGGTGACCGGCACCAATCCACCCGCAGGCACCACAGTTCCGG 

CGCGCGCTGGGCTGGACCGGGATGCTCGACAA 



TCGATCAGT^^ 



20 sssss^^ 

20 CGGGGACCGGCGTCAACCGGGACGGCATCATCACGCTGAGGTTCGGCCAGTAG 



>Rv0016c pbpA TB.seq 18762:20234 MW:51577
>KVUunoc (jup^-k i lj.^w^
TGCGTGAGGCATTCGTCAAATCATGCAACACCGCATTCGTCCAGCTGGGCATCCGCACCGGCG

rr^CGCCCTGCGCAGCATGGCGCGCGCGTTCGGTCTCGATAGCCCACCGCGCCCAACTCCG 

^GCAAG^TOGCGGAATCAACCGTCGGGCCTATCCCGGACAGCGCCGCACTAGGGATGACCAGT 

ATCGGCCAAA^G^ACGTTGCGCTGACCCCGCTAGCGAACGCAGAAATAGCCGCGACCATCGCA 

f^^^ACGATGAGG^rrATCTAGTCGC.CAGCCTCAAGGGACCG^CCTA^CAAT 

ATCTCAACCACCei ^ ^^^^gn^oAGCAGAAAGGGGOMTCCCCGGCGTGCA 

TTPGT^ATCGCCTrrcCGCCCGCACAAGCGCCCAAGGTGGCTGTTGCCGTGCTGGTGGAGAA 

ATCGAAGCCGCACTGCAGGGGGAACCATGA 
>Rv0017c rodA TB.seq 20234:21640 MW:50612

AA1 
GG 
GT< 
AT< 

TGCCTTC^o « - ■ — ■ ~ .GTTCCCGCGCTGCTCCCGGCAGCACTGTCCGAA 

MTTTCAA 



^TGGGACTTGACTAGCTACGGACTGGCCTTCCTGACCCTGTTC 
GCCCCCTACACTGACCCGCTGTTGCTCCCGGTG 
ATGATCCACCGCCTCGATCTGGTGGACA 
JCAGCAGATGCTGTGGACGCTGGTGGGC 
3GACCACCGACAGCTCGCACGCTACGGC 
.GTTCCCGCGCTGCTCCCGGCAGCACTG 
CAGAACGGCGCCAAGATCTGGATCCGGTTGCCCGGCTTCTCGATTCAACCCGCCGAAl 



AATCAAGACCAGGGGGTGCCC 
GG 

GTGGCAU i ^ i - \GATGCTGTGGACGCTGGTGGGCGTAGC 



TGCGGGCTCGCGGGTCTGGTTTTCTTGGCAC 



AGATTCTGCTGCTGATCTTCTTTTi 



CGGCGGTACTGGTGGCCAAACGCGGCCTGTTCACCAGCGC 



JGGTTCAGTTGGGTCGTCATCGGCCTGA 

TCTG<J | <j<j I OO I I i r^ww • — 

GTCTGTTCGCGGCAGGAAGCTTGGTGGCGTACTTCATrrn 

g.cctg^tggatccg^cgcagatccagacggca^^^^^^^ 



ACATCGTrrCTGOTGGTGGTTTACCTCGCCACCCAGC^^^^^^^^^^^^^^^^ 
CTGGATCCGTTCGCAGATCCAGACGGCACCGGATATCAGATCGTGCAGTCGa 



2TTTTC 



GATAGCTTCGGCAAGCTGCTGG 
TCGTCGGCGGTGTGACCCGACTCATTCCGCT- 



GCGGGTCTTCACTGCTGGCCAACTACA 



GACCGGGTTGACCACACCGTGGATGTCCTACG 
TATTGCTGGCCATCCTGGCACGCATCTCGCACGGAGC 



CCGCCGCCCACTGCGCACCCGCCCA 
TCATCGAACGCGTATGA 
>Rv0018c ppp TB.seq 21640:23181 MW:53781

>^^»|™ H3 ™ : * 3181 - 216 *^^^^^ 

GTGGC ~=^^^ 

CAACAACGAAGACTC^TCTACGC rJkQTTGGTGATTGCCGC ATTG G CCCATCTCGATGAOG 



GGCATGO 



GGCCGGCGAGGTGGCGTCCCAGTTGGTG 



ACGAi 



ATCGG 



GCCCGGTGGCGATCTGCTGGCCAAGCTG 
AGCGCAAGTCGAGATGGAGCCCGATCTC 



GA 



TGCCGCGGTGCGCGCCGGCAACTCGGCT 



iTTCGCGGGCAACCGGCTCGGCCTGGTGCA 



GAAGGCATGGGTACCACGCTCACCGCAATC 
TATCGGTGACTCGCGCGGTTACCTGCTGCGC 



^^^rAc'^ 

cggatcaccccggaggaggcgcacagcc*^^ 

CGGCCATGAGGTCGAACCGACGCTGA^ATGCGAGAAG^^G^GCCGG^ 

CGAGGTTGCCGAGAGCGCTCACCGC ^ ^c^qacocaACCGATTCTGG 
AOGTCACTGTCGTCGTCGCCGACGT^TC^CTACGAC CCGCC GGCCGGGCC 
CCGGGGCGGTCTCAGGCGAO^W^Gl^ACTGAOO^rcCCC^^^^^^^^^^^^^^^^^^^^ 

TOTGCCATCAGCCAGCGCAAGGAGATCGnAAAOGCe GTGACCGTG CTGATGACTG 
GGCCACGGTGGTCGGGW^A<^^CTA^CA^rcGTTGTCGC>^^^T^^^^^^^^^^^^^^ 

CGGGCCTGCTCATTGGTCGCGCGATCATCC^ GCACCAGCCTTACC 

CCGGTCTCCCGGCCGGCACTCTCGA^^^^^^T^G^GC^G^n^^G^^AA^TG^^^^^^^^^^ 
ACCACCAGCGAGAOAACCGAACCAA^GTCACCTCCT^ qqcAGCGCCC 

CCCGCCTCAGCCGGGCATCGACTGCCGGGCGGCGGCATGA 
>Rv0019c - TB.seq 23273:23737 MW:17153



^^^^^^^^^^ 

^^^^^^^^CGTATCACGCTGAGCGAACAGCCGGTGTTG 

,CCCTGGTGCTGACCGACGACTACGCCTCGACGCGGCACGC 

TCGGCTGTCTATGCGCGGCTCCGAGTGGTACGTCC 



, 5 CTGGTGGTGACCGAAGGTGCGTTGACTGGCGO 

ATCGGGCGCGCCGACGACTCGACCCTGGTGCT — ; gaa gaTCTAGGATCGACCAACGGCACTTA 






PCT/US00/31152

WO 01/35317 

COTOS.CA^^OOTO^CTOCO.T^TTCCGATC^-C.CCaSTTCOCATCC 
GCAAAACTGCAATCGAGTTGCGCCCGTGA 

>Rv0020c - TB.seq 23864:25444 MW:56881

>rvuu^ Rv0 020c SEQ ID NO:10 

OCGTTTGCCCGCATCT^GGAGGC^^ 

GAGGCGGCCGACGGCATCCAGTCGCTG^AGGG^ ^^^^^^^^^^^^^^^^^ 
ArTACCCTCGGTGTGGACGAC^GAA GGTGGCAAAC GTATGGTGATGTGGTCGT 

CCGATTCGAGCAGTCGTCG^GCA^ CGCCCGGC CACAATCAAACCACGCGTTTGG 
*^^^^^^^^ < ^^^ G ^^ GG ^^f G j^TGACAATTCGAGCTACCGTGGCGGTCAGGGGCAGGG 
CGCAGAACCAGGAGTAGGACCAATGAGT^^ ^TOCqOSTCCGCAAGAGGATCCGCGTGGTGGCCC 
SCGTOOCGACGAG^GAO^^^C AGACGGGCGGCTACCCGC 
, <^CCGCAAGGGGGA^ 

OCCAGCCGGGCTAOCCA^C^CGCC^ CGGTrACCCOGAGCA ACGGGGGTACCCGGA 
TACCCCGACCAGGGCGGTTACCCCGAGCAA ^ AGGGGGGCT ATCCGC 

TACGGCGGGTACGGGGAGTACGGGCGTl^. rcccGACC AAGGCGGTTACGACC 
CTCTGGCCCTCCGGGCCCGCCC^AGCAACGACCGGC^^AC^CCG^^CAAG ^^^^^^^^ 

ACCCGCTACACCGAATCCCCGCGGGTO= GGAC TACGGTCAGCCAGCGC 
!5 ACCCGCCGGCCGAGACTACGACTAC^^^^y^GAGG^ ^^^^^^^^^^^^^^ 
CCGGTGGCTAOAGCGGTTACGG^G^CGGCWGGG ^t^tcgGTCGC 

CATCGTCCGCATGCACTGA 

>Rv00 3 2 «oF2 C-^^»B.^ fe B^m^342 9 6: 36 807MW:8624 5 
" ^^^^T^rnCM^^^G^ATnGG^CGATGGGCAACCGCTAGGCCGGGCAAACCTCTATAG 

^cxg^cgT^ 
TCGACGAGTATGCCGAGCATGCGCCGGTA"



TTGGCCCGGCTTTGGCGCAACGTCAAGACGGAGG 



CAAAGGATTACCAGCGCGAGGACCTGAACCCTGA 
TGGACGTAGCAGACTGTGGTTC 
10 GGGGTGCGGATGAGAACTACA* 



GTTCTTCGCGGCGTGTTCTCGGCATCTGCA 



GTTGTTCCGCTACCAGGGCACGCCAATTGCCTTCTTTTTGAACGTTT 



TOGACGTAGCAGACT^ 
GGG GTGCGGATGAGAACT^ 

^gaS 

ATACCGAC 



eCATCTATTTCCTGCGTCACAGCACGGATCCGGTGCATACGGCAACGTTAGCGCGAA 



G^G^^ 



rGGGCCTGGCCACCCATCCAGAGGTGGTGGAC 



TCGTATCTG^,-^— ^ GAACGGCACGTTGGACTTGC ACGTCTCGCT 

20 ^ygcGAATCCGGGGACATGATCATCCAAGACGCGCTG 



GTA CGGCACCGGCTGCT = G^^ 



AGCAACCTGGCGGCGATCAGCGCGCTAT 
AACCACCGCAGCCTGTTCGACGCCGCCAGGTTGT. 



CCGGGGCCGACTTCACCTTGTACCGGCAC 
CGCCGCACCGAGGGGCGCCGCCGGATCATCGT 






AACGACATGGACCACCTGGCGCGGGTGCTAi 
CGTGGACGCGGTGTTCAGCATGGAAGGCACCG 
CCGACCGGCACGGCTGCCGGGTCTATGTGGACGA< 

GACGGGCGAGGAGCTTCGGCCGCGTTGGGTGTCT^^^^^^^^^^^^^^^^^^^ 

SAGCCTGCCGCCGGCCGCCGCGGCTGC 
CATCCGGCAUAM^ • • — JGTGAACCCGACCGGCGGGCTCGGGTGCTGGCCG 



GTTCAGCAAATCCTTTGCCTCCGTCGGCGGGTTCATC 
\TCCGGCACAACGGTTCAGGTCATGTGTrrrCi 



iTCGCCGACCTGGCCACCATCGCCGAGCTTG 
GTCCCATGCGCTGGGCGTGCTCGGCCCC 
CCGCGTTGGGTGTCTTGGCGCGCATGGACGTGGTGATGGGCAC 
CATCGCCGGAGATCGGCCCGTCGTGGACTA 
CGCCAGCCTGCCGCCGGCCGCCGCGGCTGC 



CA< 



GCCACGCGGCTCTGCGCGTCAGTCGGCC 






CGGCCGAGTACATGGCCACCGGCCTGGCACGGCAGG 
GCGATCGTGCCGGTGATCCTGGGCAACCCGACCG 



GCTATCAGGCCGAGTATCACGGAACC 
TGGCGCATGCGGGCTATCTGCGGCTGAT 

CGGCCTTGO 



CGAGGACCTGACCCCGCAAGGAGCCGCGCTATGA 






>Rv0050 ponA1 TB.seq 53661:55694 MW:71119 
>emb|AL123456|MTBH37RV:53661-55697. ponA SEQ ID NO:12 
^GGTGATCCTGTTGCCGATGGTC^^

CAGGTGACATCCGTACCAACCAGGTCTCCA^ 
AAATTGTTCCGC<^GAAGGTAATCGGGTCGACGTCAA^C^^^CCAG^ 

■^^^^=£ 

GGGGCGCCTACGGCATTTCGGCGGCGTCCA 



3CAAGGCGAAAGAATTGGTCATCGCGACGAAGA- 






TGCAGGCGTATCTGAACATCATCTACTTCGGCCC 

^^^^^^^^^^^ 



CCGTTGCCGAAGGGGCGTTGTTGGCAGCGC 



AA 



JTGGGTACTCGACGGCATGGTGG. 



AAACCAAGGCTCTCTCGCCGAATGACCGTGCGGCGCAG 



GTGTTTCCCGAGACAGTGCCGCCCGATCTGGC 



CCGGGCAGAGAATCAGACCAAAGGACCCAAC 



5GGCTGATCGAGCGGCAGGTGACAAGGGA 



GTTGCTCGAGCTGTTCAACATCGACGAGCAGACC 



VTTGATCCGCAGGCCCAACGGGCGGCGGA 



15 GAA< 



CTCAACACCCAGGGGCTGGTGGTCACCACCACGM^^^^^^^^^^^^_^^^ 



GGCGGTTGCGAAATACCTGGACGGGCAGGA 



CGACCCGCACAACGGGGCGGTGCGTGCGTA 



CTACGGTGGCGACAATGCCAATGGCTTTGACTT 



GCTCAAGCGGGATTGCAGACTGGATCGT 



CGTTTAAGGTGTTTGCTCTGGTGGCCGCCCTTGAG 



AGACAGCTCTCCGTTGACGGTCGACGGCATCAAGATC 



^^^^^^^^^^ 






5AACACCTCCTACTACCGGCTGATGCT 



CAAGCTCAACGGCGGCCCACAGGCTGTGGCCGATGC 



CG' 



CGCACCAAGCCGGCATTGCCTCCAGCTTCCC 



GGGCGTTGCGCACACGCTGTCCGAAGATGG 



====== 

JGGACGGCGCCTTGAAGGGCACGTCG 



JTGGGTGGGCACCGTCAAGGGTGACGAGCCAC 
GCTCGGGCCTGCCGTCGGACATCTGGAAGGCAACCA 



TG 






AACGAGACTTTCCCCAAACCGACCGAGGTCGGT 



GGTTATGCCGGTGTGCCGCCGCCGCCGCCG 



CCGCCGGAGGTACCACCTTCGGAGACCGTCA" 



JCCAGCCCACGGTCGAAATTGCGCCGGGGATT 
ACCATCCCGATCGGTCCCCCGACCACCATTA*
GCGACTCCCACGCCGCCGCCGTGA 

>Rv0051 - TB.seq 55694:57373 MW:61210

>emb|AL123456|MTBH37RV:55694.57376. Rv0051 SEQ ID NO:13 




GCCTGGCGCCACCGCCCCCGGCCCCGCCCGCT 
rTrACCGGCGCGCTGTCCCAAAGCAGCAACATCTCOCCACTTCCTTTGGCCGCCGATCTGCGG

CTCGGTGGC^C^TAGGCCGGCACGCGCTGATCGGCCQCACCCGGCTGATGACCCCGCTGCG 
r^A^^TCGCGTT^TGTTCCTOGCGCTCQGTTGGTCGACGAAAGCGGCCTGCTT 

CTAC^AGTro"^^^^TCCGATACGGTGCCGCTCTATGGCGCTGAGTTATTGAGCCAAGGCAA<3 
T^^GT^AAATCAAGCTGGATCGAAACCGACAGCAACGGCACACCGCAGCTGCGCTACGAC 

^g^gg^ 

TG^CGATAGCCAAGACCTACACCGCGTTAAGCAAGGTGGCTCCCCTCCCGGTGGTTGCCGAAG 
T^TGATG^CTTCAACGTCGCCGCGTTCGGTTTGGCGCTGGCGTGGCTGACAACCGTCTGGG 
raACCTCGGGC^TGGCGGGCCGCCGGATATGGGATGCGGCGCTGGTGGCCGCCTCACCGCTG 

Sgatc^t^ta^caccaatttcgatgcgctggcaacgggtttggggacgagtgggctgc 

TGG^CTGG^CG^GGCGCAGACCGGTGCTTGCCGGTGTGCTGATCGGGTTGGGCTCCGCGGCG 
AA^Gra^M^CGCTGTTGTTCTTGTACCCGTTGTTGCTGCTGGGCATCCGGGCCGG^CG^^*GA^ 
1T -- TCTGGCCC ^ 0C ATGGCGGCCGa3GCGGCGACCTGGTTGTTGGTGAATCTGCCGGTGA 
TGCTGCTCTrn^^^CGGCTGGTCGGAGTTCTTCCGGCTCAACACCCGGCGCGGCGAGGACA 

^ACTOG^GTACAACGTCGTGAAGTGGTTCACCGGCTGGCGTGG^ 
GCTTCTGGGAGCCGCCGCTGGTGCTGAACACGGTTGTCACGCTCTTGTTCGTGTTATGTTGTGC 

GGC/^nGCT^^^TCGCGCTCACCGCACCCCACCGGCCGCGCGTGGCGCAGCTGACTTTCTT 

S^G^AGCTTGCTGTTGGTCAACAAGGTGTGGAGTCCCCAGTTCTCGCTTTGGCTGGTG 

CCGCTGGCCGTGCTGGCTTTGCCGCACCGCCGGATCTTGCTGGCGTGGATGACGATCGACGCG 

^QQTG^^^T^CCG^GGATGTACTACCTATACGGCAACCCGAGCCGCTCGCTGCCCGAGCAG 

TGGTTCACCACGACGGTGTTGCTGCGTGACATCGCCGTGATGGTGCTGTGCGGACTGGTGGTC 

TGGCAG^XTAC^GCCCCGGGCGCGACCTCGTGCGTACCGGCGGGCCAGGGGCACTGCCGGC 

^CTG^GGG/^^CGACGACCCGGTGGGAGGGGTCTTTGCCAACGCCGCCGACGCCCCGCCAG 

OTCGGCTACCGTCGTGGCTGCGTCCGCGGCTGGGCGACGAGCATGCGCGAGAGAGGACGCCC 

GATGCAGGTCGCGATCGCACTTTTTCCGGGCAACACCGCGCTTGA 



>Rv0106 - TB.seq 124372:125565 MW:43701



>emb|AL123456|MTBH37RV:124372-125568. Rv0106 SEQ ID NO:14
ATGCGTACTCCGGTGATATTGGTGGCAGGTCAGGATCAWCCGACGAGGTGACGGGCGCCTTG 

^G^GCCGGACCGGAACGGTGGTCGTGGAGCACCGGTTTGACGGCCATGTGGTGCGA^GGAT 

GACTGCCACGCTGAGCCGTGGCGAATTGATCAC^ACGGAGGACGCTTTGGAGTTCGCCCACGG 

ctctgTgtcgtgcacaatccgcgacgacctgctggtgctgttacgoagactgcaccgccgaga 

CAATGTCGGCCGGATCGTCGTGCACCTGGCGCCGTGGCTGGAGCCCCAGCCCATCTGCTGGG 
CG^CG^CCACGTGCGGGTTTGCGTCGGACACGGATACCCAGACGGACCA^^^GCCproGA^ 
GTGCGGGTCGCGGCCGTGGTGACCTGTGTGGACTGCG7AAGGTGGCTGCCGCAGTOACTCGG 
CCOAOCTTCT^TOCTGACCCAC^^^^^ ctggacga
GGCAGCCGACG^GAGGTTGC^TO^^ 

TCTGCATGCCGC.BGTT^GCTG^^^^^^^^^, 

GGCCAACCGGCCGGATCAGGTC^G^^*^^'^*^^^ ^^^^^.^^^^qq 
CGGCCGGAAAGTGGTTGGCGGC&^GG^GGCCTCGGAGG ^^^^^^^^^^^^ 

CGGTTGTrCGCCGACCTGATGTGGGTCTACCCGTrc TGC TCAGCGAC 
CTGGTATGCGGCGCCGATC^G/^CGAC^CGTCAATG^^CTGAACG^^^^^^^^^^^ 

^^^^^^^^ 

TCTCGATGA 

>Rv0125 - TB.seq 151146:152210 MW:34927

,^123456,^7^:151146.152215 U*pA ^ ^' D G ^ GAGCGTCCTG(3CTG CCGTC 
ATGAGCAATTCGCGCCGCCGCTCACTCAGGTG^rc^GG^GCT^^^^^^^^^^^ 




TCGCCGACT^CG^G^^AC^ 

GTGGTCAACATC/ 
ATCGATCCCAACC 

GCGrrCAGCOTO*,- . -\ GGTGGC CTGCCGTCGGCGGCGATCGGTGGCG 



GTG 3TCAACATCAACACCAAACTG^CTACAAC^ 



^^^^^^^^^^ 

CGGGCGGGCCCGTCGTCAACGGCCTAGGAL^ CGGGCA<3GCGA TGGCGATC 

AACTTCCAGCTGTCCCAGGGTGGGCAGGG^ 

GCGGGCCAGATCCGATCGGGTGGGGGGTCACCCACCGTT^ATATCGG 

GGCTTGGGTGTTGTCGACAACAACGGCAACGGCGCACGAGTC^AACGCG^^S^^^^^^^^^^ 
TCCGGC^CAAGTCTCGGCATOTCCAOOG^^ 

CGGCCTGA 

MW:66832 SEQ ID NO:16
>eml



CTCCGAGGGCTCCAGGACCACCCCGTCAATTGTCGCGTTC 
CGGCCAGCCCGCCAAGAACCAGGCAGTGACCAACGTCGA 



GGCGACCCGGTCGTCGTCGCCAAC 
OA^TACACCGCGCCGGAGAT^G^^OA^GA TACTTCAATGACGC CCAG 

ctacctcggtgaggacattaccgacgcgg™tcacg ^c^cotcAACGA 



CGTCAGGCCACCAAGGACGCCGGCCAGA1 



^CAAGGGCGAGAAGGAGCAGCGAATCCTGG 






GCCGACCGCGGCCGCGCTGGCCTACGGCCTCGAC 
TCTTCG 

GGTCCGTGCU^ . , — ■ — ; agcggcatC GATCTGACCAAGGACAAGATGGCGA 



CGTTTCCCTGCTGGAGATCGGCGAGGGTGTGGTTGA 
CGGCGGCGACGACTGGGACCAGCGGGTCGTCG 



CGATCAACCTGCCCTACATCACCGTCGACGCC 1 



JAAAGATCGAGCTGAGTTCGAGTCAGTCCACCT 
GACAAGAACCCGTTGTTCTTAGACGAGCAGCT 



^CCTGCTGGACCGCACTCGCAAGCCGTTCCA 



TCTTCGACTTGGGTGGTGGCACTTTCGAI 

rGCCACTTCGGGTGACAACCACCT 
ATTGGCTGGTGGACAAGTTCAAGGGCACC/ 
TGCAGCGGCTGCGGGAAGCCGCCGAGAAGGC/ 

cgatca; 

GACCCG 

ttcgacccggatgcccgcggtgaocgatctgG t ^ cc ^ tct ^ g ^^ ttctca 

AGGGCGAGGTGAAAGACGTTCTGCTGCTTGATGTTAGCCCGCTGAGCCTGGGTATCGAGACCA 

^GGGCGGGGTG> 
\CTTTCACCACCC 
3AGATCGCCGCC 
GCGGGGGATTCC 
CGCCAAGGACA/ 
CCAAGGAAGAC/ 

GTCGCGAGGAG^o— — - GAAGGTACC TGAAGACACGCTGAACAAGGTTG 



- ^^^^^^^^^^^^^^qqqq^^^^GTGTCGGAGATCGATCACGTTGTGCTCGTGGGTGG 

JGGATGCCCGCGGTGACCGATC1 
CAAGGGCGTCAACCCCGATGAGGTTGTCGC 
»CGAGGTGAAAGACGTTCTGCTGCTT< 
sCGGGGTGATGACCAGGCTCATCGAC 



^GCGCAACACCACGATCCCCACCAAGCGGTCGGAG 

20 '%^^^7^^™™^™™™<*°^ r 



CCGCGCACAACAAGTTGCTCGGGTCCTTCGAGCTGACCGGCATCCCGCCGGCGCC 

c 

» COAAGGAAGA.^^^^-^^— 
TCAAAGAACAGC3TGAGGCCGAGGGTGGTT^^ G ^ ATAmcGGccATcAAGTcQ 



GAGATCGc^^^--- ■■■ --_ ^ a _ ^cecc^coGCATTGTGCACGTCAC 



ATG 1 



CCGCGGTGGCGGAAGCGAAGGCGGCAC 



GCTCTGGGGCAAGCGATCTACGAAGCAGCTCA 



^^^^^^ScLcCGGCGGCGA^CGGGCGGTGCCCACCXC 
M GGCTCGGCTGATGA^^GTGGA^GTOGAGGTGGTCGACGACGGCCGGGAGGCCAAGTG^ 

>R»035, grpE sdmufc.es DnaKATPasea«.v,» ^421707:42241,^24501 






GACGGCTGCGGCCGATGCGGCGCACACCGAAGACAAGGTCGCCGAGCTGACCGCCGATCTGC 

— - rr ^fceTTCGCCAACTACCGTAAGCGGGCGTTGCGCGATCAGCAGGCGGCC
AAOGCGTGCAGSCCGACyCGCCMCTA GTGT ACTGGAGGATCTCGAGCGG 
GCTGACCGAGCCAAGGCCAGCGTTGTCAGCCAATTG^^ ^^^^^^^^^^ 

QCGCGCAAGCACGGCGATTTGGAGTCGGGTCCAC^^GT^^^^^^^^^^^^^^^^ 
GCGTTGACCGGGCTGGGTCTGGT.SGCG^CGGT^^ ^ 
5 GCACGAAGCGGTGCAACACGAGGGCGAC^^^^^^^GGG^^XAAG^ ^^^^^^^ 



CGACAi 



ATACCGCCGAAAACGATCAAGCTGACCA< 
AATCAGAACCGTCGGGCAGTTAA 



X3GGCAATAGCGCCGACACCTCGGGCGAACAGGCAG 






^^^^^^^^q^^T^^^^^G^CTCGG^GTTGGCGCGCGACCTGCATCCGGACGCGAAC 
AGTCCTGAAGAGATCAAACGTGCCTATCto mcGG AGGCGCATAACGTGCTGTC 

15 ccgggcaaccgggccgccggcgaacgg^ca^^^^ 

ggatccggcgaagcgcaaggagtacgacg^^g^ttc TCGGTGGAGA c 

GCGGCCGTCGG^CGACAGCGGC^G^G^G^G^^^^^ 

GGCGCCGAGTTCAACCTCAACGACTTGTTCGACG(XG^^^^^ ^^^^ 
GGTGACTTGTTCGGTGGCTTGTTCG^^CGCG^^GCAGCGCCCGTC ^^^^^^^^^^^ 

20 CGGCAACGACCTGGAGACCGAGACCGAG^GGA^GTGG^ oaMaGOOa0 CCA 

GGCACCAGCCCAAAGGTGTGTCCCAC^GCAA c ggcTCGATCATCGAGCACCCC 
GCGTTCGGCTTCTCCGAGCCGTGCACCGACTGCCGAGG^^^^ ^ ^^^^^^ 

TGCGAGGAGTGCAAAGGCACGGGCGTGACC^OOG^GA^ ^ 
25 GCCCGGTGTCGAGGATGGGCAGCGCATC^G^CTAG^CG^^^^G^^^^^^^^^^^^^^^ 

GGCGCTCGCTCGGGGGATCTCTACGTGACGGTGCATOTGC CTTTQ GGCTCGACG 
GACGGCGACGACCTCACCGTCACCGTTCCGGTCAGC*n^^CCGAATT^^^ ^^^^^ 
CTGTCGGTGCCTACCCTGGACGGCACGGTCGGGGTCCGGGTGCCCAA^S 

GCGGATTCTGCGTGTGCGCGGA^GOGG^ 
M GCT^TO^^G^^GO^AG^^Gri^^^^^^'^^^^^^^^^^^*'^^^^^''^*^ 
ACGCCGGCAGTGAC<*AA T CA T ^

TCAAGGCGGCGGCGGCCGCCAAGATCAn cagcCCGGAGGACTTCGAGAAA 
GAGGACGGCGTGGCGAACGAGATCAACGAGAAGC^A^^ 

ACCATCGAGGCGCTGGGCGCCGGTGAGCAW3GCAAA^^^^^^^^G^3^^^^^^^^ 

CGTGCATGGCGTCTACAAGCCCGGC^^^^^ 
ACAGGTGGCC^CGGCCAAGCTC^CTGCCGGCC^^CAAG 

ATG " C GACGG^^ 

^GGT^^CTCAAGAA)^^^^^®GTTCGATGAGCCAGCGGGTCGTTCAG<3CGTGCAATGACCTO 
CACTGCGCCGGAAAGTCCCTAACCCACTAA 

>Rv0405 >e ro61 AL 123 4 561 MT B H37 R V:«™ 99 3r. 
Pte6 . "!SL ^GGATAAGCTTCAAAAATGGTTTCGAGAGTACTTGTCCACGC 



CGTTTGGGATAATCCTAGCGCTAATGATTTGATTGATA- 



lCGCAGGGTCGGGGCAGCATAAACGAGC 
CGGTTGCGGTCATCGGAGTGGGC^GTC^A^TC^G^^^^^^^^^^^^^^^^^^^^^^^ 

rrTTTAAAGGATGTCGCGGGTTTCGATAATAGAT 



COrT^GAGAGAGTCATGGACACGCCGACAGGAA^^^^^^^^^ 



GGGACmCTGACCGAGAAGAAGTGTGCGATAACAGCCTATC^ q 
TGGAACTTTCGCGGAGTCGGGAGGCTTWTAAA^^^^^ 



ATATCCCGCCGGACGAGGCTCTGCGAATGGATCCGC ^ CTA0TOGCGTAT rC 



TTTG 



GGATAACACCGGCGGTTCTTCGAGTATTA 



iTTGCCAATAGAATCTCATACTTTCTCGATATTC 



CAG 



TCGAT, 



AACCATGGGGTGGGTTTAGGGAAGCGGG 
AAATCCGCCGACGGGATGGTACGCGGTG 



CATCTTGTCGCAGACAGGCTGCTGTCACGCGT 



AGGGATGCGGAGTTATCGTGCTGCAGCGCC 
\TTCTGACGGGTTCAGCGGTCAATC 
i^TG^CGCCAAATCCTAGTGCGCAAATTGGTGTTC^GAAAAT 

GCATGCAAGAGCGCTCGCGTCGATCCGCTGGAAATCGC 



35 TCAGTGATGCACGCCTTGAGGGCCGGCGGATATTAGCGAT 

AGGACGGTAAGTCCAACGGTATTA1 _ _ -JGCTACGTCGAGGCCCACGGGACCGG 
AACGTCGTTAGGGGATAGGATCGAGGCGCA



GGGA 



TCTGGGCCCCTGATGATCGGGAGCATCAA 



CGCCTTAGGCATGGTCTTTGGTCGCAAGAGACC 
GCCGAATATCGGCCATCTGGAAGGTGCGGC 



TGGCATCGCCGGATTGATCAAGGCG 



GTGTTGATGGTTGAGCGTGGCTCGCTGCTTCCGAGCGG 



GG< 



QGTTTACGGAGCCAAATCCAGCTATCCCA 



TTCACGGAATTGGGCCTGAGAGTTGTAGACGAA 



CTTCAGGAGTGGCCGGTGGTGGCG 



GGTCGGCCGCGCCGGGCTGGGGTGTCATCGTTCGGCTT 



TGGCGGCACCAATGCGCATGTGATTGTCGA 



GGAAGCTGGTTCGGTTGGGGCGGACACGGTTTC 



GGGCCGCGCGGATGTTGGCGGTTC 1 



CGGTGGTGGGGTGGTGGCGTGGGTGATTTCGGGGAAGA 



CGGCTTCGGCGTTGGCTGCTCAGGCGGGT( 



CGGTTGGGGCGGTATGTGCGGGCTCGGCCGGCG 



VCGCGGTCGGTGTTTGATCATCGGGCGG 



10 TGGTGGTCGGCCAGACTCGCGATGAGTT^^^ ^ A# ^rvar*r.fV3(3CAAGACGGCTTTTGTGTTTGC 
CCGGAGGCTGGGGTGGT< 

CGGTCAGGGCTCGCAGTG. xccGGCACCTGCGGTATCCGC TGCGCGATGT 

GAATTCGCCCAGCCGGCGCTGTTTGC 



^^^^^^^^^^ 

GGCCCTCGATGCTGTGGTGGACGAGTTGGAC 



CGA< 

GATCTGGGGGCAl 



.CGACCAAGATCTGTTGAATACCACC 1 



G



GTGGAGGTGGCGCTGTATCGGCTGCTCATGT 



CGTGGGGGGTGCGGCCGGGTTTGGTGCTGG 



GGTAGCGCCGATGCTGGGGCACGATGTGAGCATCGCGG 
TGATCTCTGGTGCCCACGATGCGGTGAGCGCGATCGCTG 

ccc 

TQ CGCCG3TGCCAGTCGrrTCATC G AA^ TOcccAcGCTGTCc ^ GATcTCccOT ^ 



.CGTCGCCGGGGCGCTGTGTTTGCCGGATGCG 



iCCGGGCAGTTGGTGGCCGACGACTTCGCCTCAGCT 



CCCACGATCCCGGTCATTTCCAATGTGAi 

VTTACTGGGCCCGGCATATCCGGGCGGTGGT' 



GCGGTTTGGCGACAGTGTTCGTAGTGCCCAC 



» GCATC ^ CT ^ G ^ C ^ A ^^^^^^^^^^ n ^ CGTC jCGGGGATGGGCCTGGATTGGGCCT 
GGTCAGTGT< 

CGGTGTTTTCu— . — JCACCGCCGCCGGCCAGATCGGGGCT 



^^^^^^^^^^^ 



AGTTCTGGCTCGCACCAGCCCCATCGGTCAGCGACCCC 
^GCGATGGTGGTGCTGAACTCTTGGCGTCCTCC 



GA 



'CGGGTTTGCCGCCCGGCTGGCCGGTCGGTCG 

TCTTCGATCACCCCACC 
^^^^^^p^^^^^^GG^C©G<^TGCCGATGGTTGGAAGATGGTCGCCCTGGCGTCGAAT 

" ^g^^gXcaocggttcggaacaacgtatcgaagaacgtcgcactgctggca 


ga^cgatcacccca^^ 



TATTGATGTGGTAGGCGGCAGCTGCCGTTTTG CCCCTCGGAGT CGCACTCA 
GCCTATGCCCTCTGCTCCCATCTGTCG^^CAAG^W^AGC^G/^^ ^^^^^^^^^^ 

TCGATACATATCTGCCTAGTCAGATCGC^T^CA^AA GTTGAAT CGGTTAACTGCCA 
ACTGGGAAGGGCCTTTCCCGTGAAGTAAn^^^G^®^^*^ ^^^^^.^^.^^j 
CCCGACTCACC^GCAGCCACCTAT^GC^TCTT^GC^ 

GGCTCCGGTTCTTAACATCGTGGOGMGGfc^G^^^G^^CCG^^^^^^^^^^^^ 
— 

TGGGCTCGTTCGAGGGCCTCATCGGTAG 

i . . « famih/ to sea 522348:524531 MW:75315 

TGGGACGCGGTGTCGCTGACCGGCTCT^^CGCCGCG 
AGACAOCGCGGTCGGGACGGT^GCT^ACGTGACACT^ 

AGGCAOCGAGGTGATCGTCAG^CG^CACCG^C^ ^ TGCGG CAGGCCCTACTCGGCAAG 
20 GTTCAACGCTGGCCACCCAGTCGGTGGCGC^GTCA CCCCGGCAC A T CCACG 
GTGATGACCGTCGGTGACGCGGTCTCGCTGCT^^^^^^®*^ ^^^^^.^^.q^P 

TCGGCTGCCAGCCGCGCATrGGG^^^^^®^^^^^^^^^^ T ^^^QQj^^Qj G l3 G Q 
GTTACCGGCGTCGACCCCGACGGGCCGGTCAGCGTG TG/ ^ TCTCGAGTC CGG 
CGCTGGGGTCCCGGCCGCAATGGGTA^GT^^^^^^^^^^^^^^^^^^^^^^^qq^jj-^ 
25 AGATCCAGATCGAAGAGCT^^^CAGC^A^^^^ m ^ TCTGc 

agcttgccctcgatgagccgcacctactacagacctt GCGCGGTGTG cgacggccg 

TGGTGTCGGGTCCGGCCGGGGTGGGCAAGGCGAC^CTGGTGCG^^^ ^^^^^^^ 
AAGGTTGGTGACACTGGATGGTCCGGAGA^GG^^^^CTGGCC ^^^^^^^^^^^^^^^^ 
CGTGGC^CGGCAGTGCAGGCGGTT^^^^^^^^ 

M CGCCCT<*TGCCAGCCGCCGCC^C^ 

CGGTGGCCACCGCGGGTGT^GATCGCC^CC^ CCGCXGCCCG ACGCGGCCACC 

ACGGCTGGTGTTCGTTGAACCGCCCGACGCTQ CGAGGTGGCAGCCGGA CTCGACGGTTA



>RV<M36C pssA CF**— » ^ PhaM ^To? 0 ^ 5 2 2 531:525388 
RW P . i . i ,™ se ,MTBH37RV c52538e-524528. pssA SEQ ID NO.22 
MW31219 >emb|AL123456|MTB H37KV.c= ATACTGCCCAGC GCGATGAC 
ATGATCGGAAAGCCCCGCGGCAGGCG^^GGGTAAA^CTGCAG^^AC^^^^^^^^^^^^^^^ 

GGTGCTGTCCATTTGCGCGGGACTGACGGCW^^TGCG GGCCC GCA 
CGCGATGGCACTGATCGCCGCAGCGGCCATC^CG^CGGGCTCG^^ ^^^^ 



TCCTGGA 



JGCCCAGTCGCGGATGGGCGCAGAC 
GACCCGCGCTGGTGCTTTACGTGTCGAT 



GTTGTCGAAGTGGCCGGTCGGTTGGGTGG 
CGGCTGGCGCGGTACAACGCACTGCAGGACG 



GAGTGA' 

TCGTGCTGCTCTACGCGGTGTGCGTGGTAnA._ GCGC CGGCGGGCGCG 
ACGGAACCCAGC<^^CC1^C^^^GArcAATTC^C^T^G^^^^^^^^^^^^^^^^^^ 

GTTTCCATGATCGGCCTGCTAGCCCTCAAAATG AGCGGGATCO CGATGAAAAA 
GGTTCCTCAGCTTTTGGGTGACGGGAAf^TCGAn^^^TTGGTCA^^^^ ^^^^^^^^ 

^^^^^^^^^''^^^^^^^QQ^QTAC^^^nrG^TCTGGGTGATCATCATCGCCTACATGTGC 
CGCGGCGGCCGCAGTCCTGGCCCCCTACTT^^^^^^^^^^^^^^^^^^^^^^^^ 

CGAGCGGCCGGGCGCATCCCTACCGGCCGT 



CATATTCCTTTCGCGGTGCGCAGCCAGCGCT 

AAGCCCAAGCAACGGCGCGCGGTGCGGCGGG _ 
^GATGGCGCGGCTGGGCCTGCGCAAGCCGGGTCGACGGCTGTGA 



>Rv0440 groEL2 60 kD chaperonin 2 TB.seq 528606:530225 MW:56728
>emb|AL123456|MTBH37RV:528606-530228, groEL2 SEQ ID NO:23

ATOGCCAAG^^^^^^^^..^ 



-^TrrrTCTCAAACGCGGCATCGAAAAGGCCGTGGAGAAG
A CGTCGCGGCCGGCGCC«CCCGCTC^ 

GTCACCGAGACCCTGCTCAAGGGCGCCAAGGA CGCCGAGGCG ATGGACAAGGT 

OOOCAACG^^^^^ cggGGTACTTCGTGACCGACCGGGAGCGTCA 
6 CGAGGGTATGCGG^C^GGG^T ^^^GTCCACTGTCAAGGA 

^^^^^^^^^^^^^^^j^qt^^I^SGAGCCGGTAAGCCGCTGCTGATCATCGCCGAGGA 
TCTGCTGCCGCTGCTCGAGAAGGT^GAG CGCGGC ACCTTCAAGTCGGT 
CGTCGAGGGCGAGG^^^GTCCACCCT^©^^^^*^^^^^^^^^^^^^^^^^ 

^^^^^^^^^^^^.^^q^q^^^SGTCGGCCTGACGCTGGAGAACGCCGACCTGTCGC 
,0 CACCGGTGGTCAGGTGATCAGCGAAGAG ^^tcqtcgaggGCGCC 

t30taggcaag^^aa^^^^ ttc ^^ tcgaqaacag ^ 

^TGACACCGACGC^TCG^GAOGA ^^^oqtgtCGCGG 

^^^^^^^^^^^^^^Q^^^^^^rcG^ACTCAAGGAGCGCAAGCACCGCATCGAGGAT 
TGATCAAGGCCGGTGCCGCCACCGAGeil-u GGGGGTGTGAC GCT 

15 GCGGTTCGCAATGCCAAGG^^^^^ 

OTTGCAAGCGGCCCCGA^CTGGACGAGCTGAA^ CGCCTTCAACTCO GGGCTGGAGC 
ACATCGTGAAGGTGGCGCTGGAGGCCCCGCTG^ GACTGAA CGCTCAGACC 
CGGGCGTGGTGGCCGAGAAGGTGCGCAACCTG^GGOTG CCG TTCGGCG 
GGTGTCTACGAGGATCTGCTCG^ 

>Rv0482 murB TB.seq 570537:571643 MW:38522

>emb|AL123456|MTBH37RV:570537-571646, murB SEQ ID NO:24

a ATGAAAGGGAGCGGT^^^^^^^ CATCACTTGCACCAGCGCC 
GCGCCGCTGACCACTTTGCGTGTGGGCCCGATCGC CGG AGCTGACCGCCC 



>er 



G^=^-^r=G=GG== 

rGGATCGGTGCACGGGTGAGGTGCGTTGGGTATC 



GTCTGACACCATCACTCG^ 
CGCGCGCGACCTGCGCTTCGGCTATCGCACGAGC CAGCGCACCGCTGC 

T r..r GC GGCCACCAGAAAGGACGGTCCGGTCCCGCACTATCCCGCGCCCGACGG

gSatcacactaaaacccgaacccgtgctgatcggctgcatgttotag 

>Rv0483 - TB.seq 571708:573060 MW:47859

>emb|AL123456|MTBH37RV:571708-573063, Rv0483 SEQ ID NO:25
>OT ^pl^^^TGT(^TGT^C^^CCGGTATCTTTGATACCCGTGAATAACTCCAGCACCCCCCA 

° TGGTCA "°°;°I°°^3° CQ TCTGGCGTT GACGGCCCTTGGGTTTGGGGTGTTGGCACC 

GAACGTTCTGGTCGCGTOt.^ TrccGCC GACGTGGTGCCGATCGCGCCGATCA 
CGCCTCGTCTGAC^CCGGC^ 

GCGTCGAGGTCGGTGACGGCTGGTTTCAGCGGGTC^^^^ 

CGGCAAGGCGGTTCCGGTGGCGGG 
TCAACGCGGGATTCCAGCTCGCCGACGGCCA 






oTCGCCGGGGCATACAGCCGGGATCGCACCATCTAi 
ACGACCTACACCTGGAGCGGTTCGGCCGTCGGCCATGA 



G 



CAAGTTCACCACCGTGGCACCCGTCAAGACGA 
CCGTCGGGATCGCGGCGCCGGTGATTATTCA 



GAL 

CGTCGAGCGGGCACTAACCGTGACCACCGACCC 



GTTCGATTCACCGATCAGCGACAAGGCCGC 
GCCTGTCGAGGGCGGCTGGGCCTGGCTGC 



CCGACGAGGCGCAGGGCG CT= GAC T= 

^^^^^p^^qTL^^q^^qq/^^^CGGCGTCATCATGGACTTCCCGTGCAGCTACGGCGAGGCC 
CGCATCCAAGTCGTCACCGATGCA. ACGT CGTCACCGAGAAAT ACTCGG ACTTC 



GACTTGGGGCGCAACGTCACCCGCAACGGCATCC/ 

TACATGTCCAACCCGGCCGCCGGTTACAGCCATATCCA^^^^^^^^^^^^^^^ 



.CGAACGTTGGGCGGTGCGGATTTCC 



25 AACAACGGCGAGTTCATCCATGCCAACCCTATGA. 



ACCAACGGCTGTATCAACCTGTCGACGGAGA^G^C<^ACAGTAC^^^^G/^C^^G^TCTAC^ 

JGTCGGCGCTACCGCCACCGGCGGCCAA 



GGTGACCCGGTTGAGGTGACCGGCAGTTCGATCCAC 

GACTGGGCGta , ■ — CCC CGGTCACGCCGTCGGATGCCCCCACCCCGT 



JGGTGGACTGGGACACCTGGGTGTCGA" 
ACCGGCGGCGACGCAAATCCCGGTCACCG 
CCGGCACACCCACGACTACTAACGGACCGGGTGGGTAG



>Rv0489 gpm phosphoglycerate mutase TB.seq 578424:579170 MW:27217
C^^TCACCACCGCGCATCTGGCGTTGGACAGCGCCGATCGGCTCTGGATTCCCGTGCGGCG 



TAOCTGGCGGCTCAACGAACC3CWCTACG ^qctatgaCACGCCGCCGCCGC
GGCCOGCTATGGCGAAGAGCAGTTCATGGCCTGG^GC CGGCG GT 
CGATCGAGCGGGGCAGTCAGTTCAGCCAGGACGCCG^^CCTCG^^^ ^^^^^^^^^^^^ 
GGCCCGCTCACCGAATGTCTGGCTGACGT^T^CGGTTTrr GGCAACTCGTTGCGCG 
S TCGTTGGCGACTTGCGGGTCGGCAAGAC^^T^^^ ct^TCCCGACCG 

^^^^^^^^^^ 

>Rv0490 senX3 sensor histidine kinase TB.seq 579347:580576 MW:44794

15 CGCGGTGGTGGACACCOATCGCGACGTTGTOTACCTCAAC^CGG^ ^ 

-GCGCG^G^-"-^^^,^^ 
GAAGACGTCGAGTTCGACCTGTCGCUvsov»o ^ 0 ^ nnTTrGCC GTGGTGTTCGTGCA 






^^*^^^^^^^^^^^^^^!^^«^rcGGC^AGAAGGT^^^CATT^^GGCCAACCGGCTCGGTG 



ACGACTCCGAAACCGTTCGGCGGTTCGCCGAGAA 



ACATGGTCGCCGAGTTGATCGAGCTATCCCGGCTWDAGGG^G^XG^G^G^CT^D^^A/^ATGA 

CCGACGTCGACGTCGATACGATTGTGTCGGAAGCGMTTCA^G^CATAAG^^^^^^^^^^ 
ACGCCGACATCGAAGTCCGCACCGACGCGCCCAGCAATCTGCGGGTG<3T^^ 

26 TGCTGGTTACC^ACTGGCAAACOTGGTTTCGAATG^ATTG^^^^^ 
GGTGTCGATCAGCCGTCGCCGTCGCGGTGCCAACATCGW^^CGCCGTCAI^^^^^^^^^^^ 

GCGGTCAAACAGGTCACAACGAGAGGAAGAGCTGAGCCGATGA 

.R.0500 P ™C — ^If 0965 MW;3 ° 172 

^AUa^M^ 



PCT/US00/31152

WO 01/35317 

GAACGCGACGTTCGTCGTCGTCGCGGTCAAACCAGCCGACGTCGAGCCGGTGATCQCGGATCT 
GGCGAACGCGACTGCGGCGGCCGAAAACGACAGTGCTGAGCAGGTGTTCGTCACCGTGGTAG 
CGGGCATCACGATCGCGTATTTCGAATCCAAGCTACCGGCTGGGACGCCAGTGGTGCG'TGCGA 
TGCCGAACGCGGCGGCATTGGTGGGAGCGGGGGTTACAGCGCTGGCCAAAGGCCGCTTTGTC 
ACCCCGCAACAGCTTGAGGAGGTCTCGGCCTTGTTCGACGCGGTCGGCC^CGTGCTGACCOT 
CCGGAATCGCAGTTGGACGCGGTGACCGCGGTGTCCGGCTCGGGTCCGGCCTATTTCTTTCTG 
OTGGTGGAGGCCCTGGTGGATGCCGGAGTCGGGGTGGGCTTGAGCCGTCAGGTGGCCACCGA 
TCTCGGCGCGCA6ACAATGGCTGGCTCAGCGGCGATGCTGCTGGAGCGGATGGAGCAAGACC 
AGGGTGGCGCCAATGGCGAGCTGATGGGGCTGCGCGTGGACCTTACCGCATCACGGCTGCGC 
GCCGCGGTTACCTCGCCGGGCGGTACGACCGCCGCTGCGCTGCGGGAACTCGAACGCGGCG 
GGTTTCGGATGGCTGTCGACGCGGCGGTTCAAGTOGCCAAAAGCCGCTCTGAGCAGCTCAGAA 

TTACACCGGAATGA
>Rv0528 - TB.seq 618303:619889 MW:57132

>emb|AL123456|MTBH37RV:618303-619892, Rv0528 SEQ ID NO:29

15 AT g4ggggtggttgacgtcgatgggcaccgcggtggtgctggtgtttttgctcgcggtggct 
^cataoccggggccctgctgccgcagcgtggcctcaacgccgccaaggtggacgactacct 
. Scgg^gccactcatcggtccgtggctggacgagctgcaggccttcgacgtgttctccag 

CTn^TG^nTCACCGCCATCTACGTGCTGCTGTTCGTGTCCCTCGTCGGCTGTCTGGCCCCGCGG 

a^t^Tgcacgcccgcagcctgcgggctacaccggtcgccgccccgcgcaacctggcccg 

GCTG^CCCAAGCACGCCCACGCCCGGCTGGCCGGCGAGCCCGCC^CCCTGGCCGCCACCMrCA 
CGGGCCGGCTGCGCGGCTGGCGCAGCATCACCGGGCAACAAGGCGACAGCGTGGAAGTCTCC 

ScgaLgggctacctgcgcgagttosggaacctggtgttccacttcgcggtgctgggtctg 

Sggt^ggtggcggtcgggaaggtgttgggctaogagggcaaggtgatcgtgatagccga 

cgg^gg^^ccggtttttgttcggcgtcgccggccgcgttcgactcgtttcjgcgcc^^caacac 

!5 cGTCGACGGCACGTCGTTGCAaXGATCTGTGTGCGGGTCAACAACTTCCAAGCGCACTACCT 

gStcggggcaggccacctcgttcgccgcggaca^ 

CTGACCTGATCGCCAACAGCTGGCGGCCCTACCGGCTGCAGGTCAATCACCCGCTGCGGGTI^S 
GCGGCGACCGGGTGTACCTGCAGGGCCACGGCTATGCGCCCACCTTCACCGTGACGTTCCCG 
GACGGGCAGACCCGCACGTCGACCGTGCAGTGGCGACCCGACAACCCGCAGACCCTGC^^^^^ 
M GGCGGGCGTCGTGCGCATCGACCCGCCGGCCGGCAGCTACCCCAACCCCGACGAGCGTCGCA 
^rca^TScCATCCAGGGCCTGCTGGCTCCCACCGAGCAGCTCGACGGGAGGCTGCTGT 
CGT^GC^nTT^CCCGCGCTCAATGCCCCGGCGGTGGCCATCGACATCTACCGCGGCGACi^^G 
GCCTGGACAGCGGGCGGCCCCAGTCGTTGTTCACCCTGGACCACCGGCTGATCGAGCAGGGC 
CGGCTGGTCAAGGAAAAGCGGGTCAACCTGCGCGCCGGTCAGCAAGTCCGCATCGACCAAGG 
* CCCGGCGGCCGGCACGGTGGTCCGGTTCGACGGCGCGGTGCCGTTCGTCAACCTGCAGGTCT 
CCCACGACCCCGGCCAGTCCTGGGTGCTGGTCTTCGCAATCACGATGATGGCGGGACTGCTGG 
TG^CGCTGCTGGTGCGCAGGCGCCGGGTGTGGGCGCGGATCACGGCGACGACCGCGGGTACG 

CCGCAGGGACCGGAAGGGACGTCGATTGA 

GGCGCGAATCCGCCGCCGAGCGto TCGATGTCGT TGTCGTTCTCTGACCCTCGTT 

====== 

TCAAGGCGCTGGGCTGGACCAGCG^^^^^^®^^^^^^^^^ ^^^^^^^^q^ 
GATCGACGCTGGAGW^GGACAM^C^^^^=^^^^^^^^^^^^^^^^^^^ 

AGCTGCGTCCGGGCGAGCC^^^^^^^^ ^^^^^^^^^Q^^QQQQ 
TCAAGGAGAAGCGCTACGAOCTG^CCCGCGTC^ 

TGCATGTCGGCGAGCCCAT^TOGT^C CCG GGCGGCGTCGAGGTGC 
AATATCTGGTCCGCTTGCACGAGGGT^GACCA^ 

^^^®®^*^^**^*^^^^^^I^qq^^GGAGC^^QGTCCGGGAGCGGATGAC 
ATCOAAAAGOAGATCOGG^^^O^GGAG^ 

CACCCAGGACGTGGAfiGCGA^^^^CG^AG^^G^ ^^^^^^^^^^^^^^^ 
C^TCAAGGAGTTCTTCG^O^^G^^^^^^^^ 
GGGGTTGACCCACAAGCGCCGACTGTCG<X^TGGG gTGCCCGATCGAAACC 
GCGGGCTGGAGGTCCGCGACGT^ 

GGGTTCATOGAAA^^^ ggtggCACAGGCCAATTCGCCGATCGATGCGGA 

ctgacccxcgacgaggaggaccgccacgtg ggcgaggtggagtacgtgc 
cggtcgcttcgtcgagccgcgcgtgctc^tccgccg^ 

cctcgtctgaggtgga^tgga™ 
5 tgattcccttcctggagc/^gacgamcca^^^tg^^^^^^^^^^^^^^^^^ 




rnrArTACATCACTGTGATGCACGACAACGGCACCCGGCGTACCTACCGGATGCGCAAGTTTG 
CCCG^T^C^^CA^GCACTTGCGCCAACCAGTGCCCCATCGTGGACGCGGGCGACCGAGTC 

GAGG( 

GAACCTGC , w . — ■ CCTCGATCCACATCGAGGAGCATGAGATCGATGC 






CAACCGCCTGGTCGAAGAGGACGTGCTCAC 



5CGACATCCCGAACATCTCCGACGAGGT 



GCTCGCCGACCTGGATGAGCGGGG 1 



CATCGTGCGCATCGGTGCCGAGGTTCGCGACGGGGACA 
W3ACCGAGCTGACGCCGGAGGAGCGGCTGCTG 



TCGCGACACCAAGCTGGGTGCGGAGGAGATCACCCG 1 

GCTCC 
TCCTC 
CGTG 

10 CGAAk;, , ,— • - • ~ - ' — GCTCAGAAACGCAAGATCTCCGACGGTGACAA 



TCCTGGTCGGCAAGGTCACCCCGAAGGGTGAC 

1 T^^G^W^^roAT^G^^JTOGGCTGTrn^CCGCGAGGACGAGGACGAGTTGCCGGC 



CGTGCCATCTTCGGTGAGAAGGCCCGCGAGGTGCGCGACACTTCGCTGAAGGTGCCGCACGG 



CGGTGTCAACGAGCTGGTGCGTGTGTATGTGC 
^G~GAC^^^^^^ 



^CGGCAACAAGGGCGTGATCGGCAAGATCCTGCCGGTTGAGGACATGC 



^^^^^^q^^T^^C^ACTOCGACGOTGACGTGCTGGTCGACGCCGACGGCAAGGCCA 






TGATGACCCAGGAGC^^ oct ^^ a ^^ a ^ ctotg ^ tcaaq 
T CCGATGACACCGTCGGCCGCGTCAAGGTGTACC_^^ cT ^ AGTcQCTGT ^ CTcAAc 



GAGTGCTGGGGOA^GG f GTAC = G^^ TMT ^ TG ^ T ^ 



CTCGAGG^^TAT^AGTOA(iGTG^MGCGATCGAACTGCGCGAAG6TGAGGACGAGGACCT 
GGAGCGGGCCGCGGCCAACCTGGG






CGGCCGCGTCAAGGTGTACGAGGCGAT 
CCGGGCATCCCCGAGTCGTTCAAGGTGCTGCTC 

CGGCGATCGAy 

AATCAATCTGTCCCGCAACGAATCCGCAAGTGTCGAGGA 

/ v J x^i* - — 

TCTTGCGTAA 

> R v066 8 rpoC ^V.Ubunn rtR NAp<^. ra s.TB W T6^:7«731 5 MVV:,46740 
.en^AU^IMTBH^^ 

CTCTCC^G^^^At^^CGAAATGGCGGTGGAGCGCAAGGCCGTCGAAGACCAGCGCGACGG 


GCCAAGGCCGATGCGCGGCGCAAGG - q-j-^qq^qqaC ATCTGG AGCACTTTC ACCAAGCTGG 



GTGA< 
CGCO 



CCGCGCGCAGCGTGAGCTGGACCGC 



CAAGCAGCTGATCGTCGACGAAAACCTCTA 



GCGCGAACTCGTCGACCGCTACGGCGAGT 



5AAGCTGATCGAGAACTTCGACATCGACG 



ACTTCACCGGTGCCATGGGCGCGGAGTCGATCCAG/ 
CTCAAGCGGCTGAAGGTGGTTGCGG^^ 

CTCGACGCCGTCCCGGTGATCGCGCCGGAGCTGCGCCC^ 



GTTCG 1 



CCACGTCCGACTTGAACGACCTGTACCGCAGG 






TGCTCAAGGGCAAGCAGGGCCGGTTCCGGCAGAACCTGC 
GCCGGTCGGTCATCGTGGTCGGCCCGCAGCTCAAGCTGC 



ATCCGTGGACGCGCTGTTCGACAATGGCCG 
GTCCGCTCAAGTCGCTTTCCGATC" 



TCG< 



ACCAG 



GCAAGCGTGTCGACTACTCGG' 



TGCGGTCTGCCCAAGCTGATGGCGCTi 



GGAGCTGTTCAAGCCGTTCGTGATGAAGCGGC 
AGCTGCACCCGTTGGTGTGTGAGGCGTTCAATGCCGAC^CGA^^^^^^^^^^^^^^^ 



ACCTGCCTTTGAGCGCCGAAGCGCAGGCGGAGGCTCGCA^ GACC GGGCTGT 
20 CCTGTCGCCGGCATCTGGGCGTCCGT^3GCCATG^^G^GGCTG^ ^^^^^^^^^^^ 
ACTACCTGACCACCGAGGTCCCCGC^GAOACCGG^AATAC^G 

CCGGAGACTGGTGTCTACTCTTCGCCGGCCGAA^^G/^C/^GGCGGCCGACC ^^^^^ 
AGCGTGCGGGCCAAGATCAAGGTGCGGCTGACCO^C^^^GCCGCCGG^ ^^^^ 

GAAGGTGCAGGCCGCCATCATCAACGACCTGGCCGAC-- 

GACCGTCGACAAGu , wv«~~ ^qgagatCCTCGACCACTACGAGGAGCGC 



25 GCCGGG ^T^TCATCAACGACCTGGCCGAGCGTTACCCGATGATCGTGGTCGCXCA 

^tc^agggcggcttotactgggcoaccogcaggggcgtgacggtgt 



CGATGGCCGACGTGCTCGTGCCGC^G(^^AA^^^^^^^^^^^^^^^^^^^ 

GCGGACAAGGTCGAAAAGCAGTTCCAGCG™ caqgcgtTGCGGGAGCACTACC 
,„ GCTGGTGGAGATTTGGAAGGAAGCCACCGA^GWSGTCG^^CA^^C^ ^^^^^^ 

ccgacgacaacccgatcatcaccatcg^^^^^^ tcc ^ ccg 



CGCTGGCCGGTATGAAGGGCCTGGTGAC 



GAA<_ 

GTCAAGTCCTCCTTCCGTGAGGGCCTGACCGTG 



G 



AAAGGGCTTGGCGGACACCGCGTTGCGCA* 



CTGGAGTACTTCATCAACACCCACGGCGCTC 
CCGCCGACTCCGGCTACCTGACCCGACGTCTG 



. 3 T GGACGTG T CCCAGGACGTGA T CGTG= = 

^^^gSa^^cggcaacg^cgaoc 


GGTGCGTTCGGTGCTGACGTGTGCO 7GAAGCCGTCGGCATCGTGGCCGCCCAGTCC 



JGGCCACCGGCAAGCTGGTCGACATCGGTC 

C 

GCTCGAGGACGGCGAGCGGTTCTACAAGATC 



====== 

CGCCGATCGCCGACGTCACCGGCCGGGTTCG 



^^^^^^^^^ 

CCGCGAGGTGCAGATACAl 



GA" 



JCCACGACAAGCACATCGAGGTGA 



CTCGGGCTCGACGGAGTTTTTGCCTGGC 



CCTGGTTCGCGAGGTCCAGGAGGTCTACCGCGCCCAAGGTGTGTC 
TCGTTCGCCAGATGCTGCGCCGGGTGACCATCATCGA 
TCGCTGATCGACCGCGCGGAGTTCGAGGCAGAGAA 



CCGCCGAGTGGTGGCCGAGGGCGGT 



GAGCCCGCGGCCGGCCGTCCGGTGCTGATGGGCATC 



ACGAAGGCGTCGCTGGCCACCGACTCGTGGC- 



TGTCGGCGGCGTCGTTCCAGGAGACCACTCG 



GCGATAAGCTCAACGGTCTGAAGGAAAACGT 
GTATCAACCGCTACCGCAACATCGCGGTGCA 



CGTGCTGACCGATGCGGCGATCAACTGCCGCAC 
GATCATCGGCAAGCTGATCCC^ 

CCGCTAG 

>Rv0711 atsA TB.seq 806333:808693 MW:86216

====== 


TCCGACAACGGCGCCAGCGGC^ ^ CATGAAGCTCTrcGACCA cCTCGGTGGCCGGCA 



CAACGGCTACATCGACACCGTCG^- ( ^ t ^^^ ccta ^ gctg „ c ^ 
CGCTACGCCTCGCATGAAGGCG ^ GG fcAATGTCAGCGACATCACGCCCACCGTCTAC 



***^^^^^^^«!^^^i^^^TTGCCGACCCGGCAATCATCTCCTGGCCCAAC<30CA^ 



GCCGCACACGGTGAAATCCGCGACAACTAC 



GACCTGTTGGGCATGACACCGCCGGGGACCGTCAAGGGGATTCCGCAGAAACCGATGGACGG 

CCTTGCCGAC 
iCGGGATCTG( 
3GAATTTCAAC 
\CCTGGCCGC 
JCAAGTACAAC 
CTTACCTGGT( 
SCGCGGCCG1 

VCGCTTGCACTACGTCTACAACTTCCTCGGTGAGCGC 



« -_ AT . rrGGCCCTTGCCGACCCGGCCGCCGAGACCGGCAAGACCACCCAGTTCTA 

° G ™^gg^acccg^ 

GACCGC 
SCATCCC 
iCTGCCG 
JGAACGA 
GATTCG( 

atgtgaccatcgataccaccggcgccgagggcgtgctgttcaagcacggcggcgcccatggc 



cacgoccgccggotggtggaaty^^ t ^ c ^ ot ^ gctcaaggcgct 

acaacgggctgccgctggccgatctgaacctgctggaaac 



^C^^^^^^OO GQA AGACAT CT ACTCGGGGTTGGTTAT 
CAGCAGCTGGTCAGCTCGTCGGGTCCGGTCCC gg y Gggg qaTCTTGAGCTGTTCTTCGAC 



TTGCGGACCGGAACCGTGCCCAACAGTCACAC^GCC^^^^^^^^^^^^^^^^^ 

:gc 

GCGTTGACCGGCGGTACCATCACCCAGGTCAC 



GAGAACCTGGTCGGCG^CCTG^CCAA^yTG^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 

GCCGCTATCAGCGTTGGCCGCAACGGC J Q GG -fCGACGTGTCAGGCCGACCGTTCGAAGAT 

GTGGAATCCGATCTrGCGCTTGCTTTTTCGCGTGACTGA 

>RV076 4C - «— « P«0 TB.se, B^^S MW:S087 9 

ATGAGCGCTGTTGCACTACCCCGGGT^^ 

TTCCGCAGCGATCCGATCGGGCTGATGC^AC^GGT^CG^G^C^^^^^^^^^^^ 

3CAGG 

CGAGGGCGTGGTGTTCGACGCCAGCCCGL - cCATCGAAGATCAAGTCCGACGGATGATCG 



TTCCAGCTGGCCGGGAAGCAGGTCGTGCTU^ rrcATGACGCCGATCTTCGG 
, CGGGCGGGCGACGACGACCTGGA^^G^CCAAGGC^^^^^^^^^^^^^^^^ 

TACGCGGCGAGGAGATGAAGGGCCACGCT6CCAC 
CCGACTGGGGTGAGGCCGGCGAGATGGATCTGC1 

CCTCCTCGGCCTGCCTGATCGGCAAGAAGTTCC^CGA^C^^^^^^^^^^^^ 



JCGACTGGGGTGAGGCCGGCGAGATCGATCTGCTGGACTTCTTCGCCGAGCTGACCATCTACA^ 



^AGCTCGACGGGCGATTCGCCAAGC 



iTATCACGAGTTGGAGCGCGGCACCGACCCACTAGO 






CCGTCAAGGCTGAGACCGGCACTi 



CCCCGGTTCTCGGCCGACGAGATCACCGGCATGTTCATCT 



CGATGATGTTCGCCGGCCATCACACCAG 



CTCGGGTACGGCTTCGTGGACGCTGATCGAGTTGA 
mCGCCATCGCGACGCCTACGCGGCCGTGATCGACGAACTCGACGAGCTGTACGGCGACGGC 
CGATC^TGAGT^^ATGCGCTGCGCCAGATTCCGCAGCTGGAAAACGTGCTGAAAGAGACG 

gg^^cc^g^^catgagggcgatctggtggcggcctccccggcgatctccaaccggatocc^ 
gaaga^tk^ccgatccccacgacttcgtgccagcacgatacgagcagccgcgccaggaagat 

C^CTCAACCGC^^ 
G^CGCCAT^ 

10 GCG^ACCG^AGAAAGCTATCGTAACGACCATTCGAAGATGGTGGTGCAGTTGGCCCAGCCC 
GCTTGCGTGCGCTACCGCCGGCGAACGGGAGTTTAA 

>Rv0861c - DNA helicase TB.seq 958524:960149 MW:59773

>emb|AL123456|MTBH37RV:c960149-958521, Rv0861c SEQ ID NO:34






GTGCAGTCCGATAAGAC 



^CCCGAACATGTCCACACCTACCGCATCACA 



GCCATCGCGCCGTTCGCCGAGCTGGAACGTGCAl 
GOTCAQTTVKCTC^ 



CCGCTGGCACTGTGGAATGCTCGCGCCGCCGGCCATGATGCCGAGCAAGTCGTCGACGCGCT 



n^rrTACGGACGACTGCAGTTGGTCAAGAACCCGGCCCATGGCCTGACGCTGGTGAGCCTGGA 
S^GG^ 

CGATGACGACACCGTCGTCGTCCACCCCA 
GATCGG 

GCACCAGGAGGGCTGGCAGCTGCGCGATTA 



i^VVnw w>~» 

GCCGGCTACGTCGATGGTGAAGCGCACCCGATCAGCCT 



20 CC^-o , «o , . ~ \GCGAACGCGGCCGGGTCAAGCAGCTGCTGCTCAA 

OATCGGTTGGCCCGCAGAGGATCTCGCC ^^^^^^^^^^ 

GCGGCTCCGGGGTGGTGGTGCTGCCATGTGGGGCCGGCAAGACGCTGGTCGGTGCGGCCGC 
„ AAn^CCAAAGCCGGCGCGACGACGTTGATCCTGGTCACCAATATCGTCGCGGCCCGGCAATG 

GAAACGA 

GAGTACCGCCAT^rGG^^CTGTTCGACAGCCGCGACTGGGGGCTCATCATCTATGACGAGGTG 

3CTGTTGCCGGCACCGGTCTTCCGGATGACO 
3ACCGCCACGTTGATCCGTGAAGACGGACGC 
-.-^ * mrftrrftTGG AAGGACATTGAG 

\GCGGATGATGTACGCCACCGCCGAACCCG 



GAGCTGGTCGCGCGCACCTCGCTCACCGAGAATGAGATCGGCGAATTCTCGGGAGA 
GAAATCCGACCTGTCACCATCTCGACATACCAGATGATCACCCGCCGCACTAAGGGC 






Q7GTTGCCGGCACCGGTCTTCCGGATGACCGCTGACCTGCAGTCCAAACGGCGGCTGGGG 

( 

AAGCGCTATGACGCGCCGTGGAAGGACATTGAGGCGCAGGGCTGGATCGCGCCAGCTGAGTG 
CGTGGAAGTCCGGGTCACGATGACCGACAGCGAC 

^G^CGCTACCGGATCTGCTCGACGGTGCACACCAAAATTGCTGTGGTCAAGTCGATTCTGGC 



GAAGCACCCGGATGAGCAGACCCTGGTCATCGGA 



GCGTACTTGGATCAGCTCGACGAGCTGGG 



GACAAGGACCAGCGAACGCGAGGCACTGT 



rGCCGAGCTCGGCGCTCCGGTGATTCAGGGGTCC 
35 CGCCGAGCTC^^ "^TCGTGGTGTCCAAGGTGGCTAACTTCTCCA 



TCGACGCCTTCCGCCGCGGCGAGGTCGCTACGC 
TCGACTTGCCGGAAGCCGCCGTGGCGGTACA 



GGTTTCGGGAACATTCGGCTCACGCCAGGAAG 



:gacccaaggccgacgggggcggtgccatcttctac 

rGCCGAGTACGCCGCACACCGGCAGCGGTTTTTA< 

gagcagggctacgg™catcatccgcgacgccgacgacctgctgggcccggcaatttag 



^G^ 



>Rv0904c accD3 TB.seq 1006694:1008178 MW:51741

>emb|AL123456|MTBH37RV:c1008178-1006691, accD3 SEQ ID NO:35
C^TGACTCGTATCACGACCGACCAACTGCGGCACGCGGTGCTAGACCGGGGATCTTTCGTCAGC 

TGGGATAC 



WSCGAGCCGCTGQCGGTGCCGGTAGCCGACTCCTATGCGCGGGAGCTGGCCOCCGC 
TCGGGCGGCCACCGGCGCGGACGAATCGGTGCAGACCGGTGAGGGACGCGTATTCGGGCGG 



CGGGTGGCCGTGGTGGCCTGTGAGTTCGACTTCCTGGGCGGCTCGATTGGGGTGGCAGCGGC 
CGAACGGAT^^^CCGCCGTCGAGCGGGCGACCGCCGAGCGGCTGCCGCTACTGGCGTCAC 



CAAGCT^GGGAG^^^C^G^ATGCAAGAAGGC^CGGTCGCGTTTCTGCAGATGGTGAAGATCG 

ctomgccItccagctgcacaaccaggc^ 

^A^GGTG^^^^GGCGTCGTGGGGCTCGCTGGGGCATCTCACCGTCGCCGAGCCGGGC 
^CTGATC^GcmCTGGGACCACGGGTOTATGAGTT^TCTATGGCGACCCCTTCCCATCCG 

^GT^CGCCGAGAATCTACGGCGGCATGGGATCATCGACGGCG^^ 
GGCTACGACCGATGCTGGATCGTGCGTTGACGGTGCTCATCGACGCTCCCGAACCGCTTCCGG 
C^^^^AGACGCCCGCGCCCGTACCCGATGTGCCCACGTGGGACTCGGTGGTGGCATCGCGC 






20 GTCA< 



ACCGGCCGGGCGTCAGGCAGCTACTGCGACACGGCGCCACCGACCGGGTGTTGTT 
AAGCGGCGACCACG 

AACCCACGGTGGTCCTCGGCCAGCAAAGGGCAGTAGGCGGCGGGGGAAGCACTGTCGGGCCC 



CQQCCSGN " ACGCTGCTGGCGCTGGCCCGCTTTGGCGGCC 

AGTAGGCGGCGGGGGAAGCACTGTCGGGCCC 
GCGCTCGCCGCCGAGCTGTGCCTGCCGCTGGT 



GGAACCGATCAAGGCGAAGCGGCGACC; 



GATCGCG^ATTGCCTGGCCGAGCTCGTCACGCTGGATACCCCGACCGTGTCGATCCTGC 
3GGCCGGCGCTGGCG 

ACTCCACGGCTGGCTGGCGCCCTTGCCTCCCGAAGGAGCCAGCGCGATCGTGTTCCGAGACAC 

jAAGGCATCCGGTCGGCCGACCTACTGAAGTCGGG 






TCGACACCATCGTGCCGGAGTACCCCGACGCCGCAGACGAGCCGATCGAGTTCGCCCT 
ACGACTGTCGA^CG^ATCGCCGCCGAAGTGCACGCGTTACGGAAGATACCGGCCCCGGAACG 



TGCTCATGCCGCCGAACTCGCTGCCGCCG 
GATTG 

CCTCGCGACTCGGTOCAACGCTACCGCCGGATCGGGTTGCCCCGCGACTAA
>Rv0983 - TB.seq 1099064:1100455 MW:46454

>emb|AL123456|MTBH37RV:1099064-1100458, Rv0983 SEQ ID NO:36
ATGG^A^GTTGGCCCGAGTAGTGGGCCTAGTACAGGAAGAGCAACCTAGCGACATGACGAAT 

GCAAACGTA^AGCCAGCAGTTCGACTGGCGTTACCC 

GTACCGTCAACCCTACGAGGCGTTGGGTGGTACCCGGCCGGGTCTGATACCTGGCGTGATTCC 




^TrrTGGGATGGTVCGCCAACGCCCTCGTGCAGGCATOTTGGCCATCGG 

GACCATGACGCCCCCTCCTQGGATGG™. ^gccQCATcccTGGTCGGGT 
CGCGGTGACGATAGCGGTGGT©TCCGC^GG^V^^G^G^TOCGG^^^^^^^^^^^^ 

TCAACCGGGCACCCGCCGGCCCCAGCGGCG^CCC/^n^Gp^ ^^^^^^^^^^^^ 

TCGTCATGTTGGAAACCGA^TG^^OTCGG^AG^ ^ TcccCT ^ 

^'^^^^^^^^^^^^^^^^^^^O^QyCTGACGGGCGGACCGCACCCTTCACGGTGGTG 
AGTCCGCCGCCGAAAACGACGGTAACCTTCTCTGAC^ gCGGGCTCACCCCG 
GGGGCTGACCCCACCAGTGATATCGCCGTCGTCCGTGTT^G gATCGGGTCGCCG 



^CGGGGATCGTCAGCGCTCTCAACCGTCCAGTGTCGACG 

10 ^go^gIo^^ 

CCCCGGTAACTCCGGGGGCGCGCTGGTGAACATG/ «rtf~Tr.f5ATr:GGTCTCGGTnTGC 






ATCTCCCTGGGTTCCTCCTCGGACCTGAGGGTCGGTC^ 
-TCGGTTTGGAGGGCACCGTGACCAC 

GGCAAGGCGGAGCAGTGA 

>Rv1008 - Similar to E.coli protein YcfH TB.seq 1127087:1127878 MW:29066

S^^^CGGCGrGA^^^^^- 
25 GAGTCCGCGCGCTGG^^CACCCGCGCG^^^GAM^^G^TCGG^CGMSTCT^^^^^^^^^^^ 

CAAGCGGACCGGTAAACCGCTGATGAT^^ 

====== 

3 5 ISIgcaacgctcgccgagcttatgggctagggtggatgcgccaatga 




^®^'®^^^^'^^*^^^^^^^^^^^^^ ^^^CGCGATGCGGGTGACCACOATGAAA 

rCTCAGTCGACGACCGCGACGACCTGTAT 



GCCGCATGCAAAACGGTGACGTTGACCGTCGACGGAAi 






TCOC^TG^CGACATCGT^O^^^^ ^ T 3 C TGC GG CGTA G CC QTC CGCT 
CCCGC^GCCG^T^ 

GCAGATCTCGCTGGATGG^TCA^G^CG^^A/mC^^^^^^^^^^^^^^^^^^^^^^^^^^ 
AGOCGCTGGCCCAACT^CGA^OCGAOA^GCG^ ^^^^ 
GTCCCGCTGTCCGGGATGGCGC^GTOG^GCG GAGTGCGGCCG 



\GGTCACCGAGCGGCTGCCGCTGCCGCCG 

5CAGATCCAGG l wuuwwrv, ■ ~ 

^-^«/-« a <~i-s Arrrv5C5AGATGAAC^ 

AAC 



ATGCAGATCCAGGTGACCCGCAATCGGATCAAGAAC 

GGTTCCGGGGACCCAGGATGTGACGTTCGCGGTAGCTGAGGTCAACGGCGTCGAGACCGGCC 



.CGCGCGTCGTGTCGAGGACCCGGAGATGAACATGAGCCGGGAGGTCGTCGAAGACCCGGG 



GTATGTGCTGCACGAGCGGGTGCGCGCTGA

> R V10,0 ^A 16 Sr R NA tfl n» m ,« ra n S .er3SeT B .se q 1129150:1130,OOMW:«647 

>emb|AL123456|MTBH37RV:1129150-1130103, ksgA SEQ ID NO:39
^GTGC^3C^^G^^C^GGTGCGCGCTGACCATCCGGCTGCTCGGGCGCACTGAGATCAGOCG 



CGGGCCGGGCCTGGGATCGCTGACCCTGGCAC 



TCGAGATCGATCCACTACTGGCTTCTCGGCTGCAACA 



GACCGTGGCGGAGCACTCGCACAGCG 



AGGTTCACCGACTAACGGTGGTGAATCGCGA' 



30 CG' 



GCGCCGACCGCGGTGGTTGCCAATCTGCCGTA 



iCGTCCTGGCCCTGCGCCGGGAGGATCTAGCCG 
CAACGTAGCGGTACCGGCGTTGTTGCATC 



TGCTTGTCGAGTTCCCGTCGATCO 



GTGTCGTGACGGTGATGGTGCAGGCCGAGGTCGCCGAAC 



.CGGCGTGCCCAGCGTTAAGCTGCGCTTCTTC 



CTGGCCCATTCCGCGTGTCTAT 



GGCTCGCCGCCGAGCCGGGCAGCAAAGAGTA 

GGGCGGGTTCGCCGCTGCGGCATGGTGTCGCCGACCG1 _ , 

^C^i^^GT^SCGGTGAGACGCTGTCCATCGACGACTTCGTGCGGCTGCTGCGACGGTCCGG 


CGGCTCCGACGAGGCCACCAGCACCGGCCGGGACGCCAGGGCGCCGGACATTTCGGGGCAC 
GCGTCGGCGAGCTGA 

>Rv101 1- Hon,o 109 y » b* F**. YebH ™.seq 1 130189:1131 106 MW.31350 
>emb|AL123456|MTBH37RV:1130189-1131109, Rv1011 SEQ ID NO:40

;Sacc"g7cacog^cgggtgcccggaaa^ 

^TCGC^CGAGGACGGCTATCACGAGCTGACCACGGTATTTCATGCCGTCTCGCTGGTCGAC 
^S^AACGCTGATGTGCTCTCGCTCGAGTT^TCGGCGAG^OCG^ 

^™S^aacgcaatct^^ 

^Lqq^q^^^^^CTCTCGATCATGATCGACAAATCCATTCCGGTCGCCGGCGGCATGGCCG 
^TGGCAGCGCGGAOjCTGCGGCGGTCCTGGTTGCGATGAACTCGTTGTGGGAACTCAATGTGC 

cc^gcgacctS^^^ 
^t!c™tggggac*ggtcgcg^ 

^I^^^^^^^^^^^ottcgccgacagcgggttgctcacctccgcggtgtagaacgagct 

cc^^^raC^GGreATCCGGATCAGCTGGCGCCGTTGCTGGGTAATGAAATGCAAOCGGCCG 



GTGCCTTCCTGTGCACCTCGGCGAGCTCGGCG 

3GAGTTTGTCGCACCGT 
GCCGGTACCCGGCGCCCGCGTGGTGTCTGCGCCGACCGAAGTGTGA 



>Rv1106c - cholesterol dehydrogenase TB.seq 1232845:1233954 MW:40743

SSg^t^ocaacct^gagcacottgctggaccgcggg^ 

^^%!c^GCGCCGTCGCTGTTGCCTGCGCATCCGCAACTGGAGGTGCTGCAAGGGGA 
T^^T^^^^^^^^^^CTGCGCCGCGGCCGTGGACGGCATCGACACGATCTTCCACACCG 

ca^^tcgaIctgatgggcggcgcgtcggtcaccgacgagtaccgcoaacgtagctttg 

^^xn^^TCGG^G^^^^AGAACCTGCTGCACGCCGGCCAGCGGGCCGGGGTGCAGCG 
^^J^^^^^^^^^^q^^CAGTGTGGTGATGGGCGGCCAGAACATCGCCGGCGGTGACGA 
^^^^-j^^^^^^-^^^W^GGTTCAACGACCTCTACACCGAGACCAAGGTGGTTGCCGAGCG 
GACGCTGCCCTATACWa ^TGCTGACGTGCGCGATCCGGCCCAGCGGCAT 



ctgggga^cggcgatcagacga^^ 

CMGGTOCTGGTCGGGCGCAAGTCGGCCCGGCTGGATAACTCTTACGTGCACAACCTGATTCA 
^^^O^^n^^JCGCTGCCCATCTGGTGCCGGACGGCACAGCGCCCGGGCAGGCTTACTT 
n^C^CGAC^^^AGCCGATCAATATGTTCGAGTTCGCTCGGCCGGTGCTCGAGGCGTGCGG 
QC/^CGCT^^^^^^ATGCGGATTTCCGGCCCCGCGGTCCGCTGGGTAATGACGGGGTGGC 

^^^^^^^^^^

GAACGAGGCCCGGGCGGAAAAAACGGCCGCCACAGTCAAGCCGTAG 



>Rv1110 lytB2 TB.seq 1236183:1237187 MW:36298



^GCCGCGTGGCTACTGCGCTGGCGTGGATCG 



CGGCCCGCCTGTCTACGTGCGTCACGA 



CGTCCCAACCGTAAGCGGGTGCTGCTGGCCGA< 

10 GGCCGTCGAAACGGTCGAACGCGCG^^^^^ 

GATCGTGCATAACCGCCACGTGGTTGACACCX^GGCTA/^^^^^^^^^^^^^^^^^^ 



GACCGAGCAGGTTCCCGAGGGAGCGATTGT< 



GCGCCAGCGAGCGCAACCTGCAGGTCATTGACGCCACCTGCCCGCTGGTCACCA 
AGGTGCACAACGAGGCGAC GGAAGCT CCCGATCATGTGCAGCTGGTCGACG 



15 ^ c ^^^l A ^^^^^ Tax Z 

mum 






CAACCGC 
CGCTGA 






>Rv1216c - TB.seq 1359473:1360144 MW:24863

^^VcCGCTCGCGGAGGGCCGAACGATTOAGAAGTTCATCGTCATCGGCGCTTTrCT 



GGGGTTCTTCGCGATGATGGTGCTGAGCGCGT 



GCGACCATCGTTATGGTTGGTCGTCAGTGCC 



AGCCGCGGTGTGCGTGATCGGCGACGTCCTAGTGA 



TGACGGGCCTTGGCATCGCCATGCTGGT 



GGTCATCCAGAACAGGTATGCCGCCTCGAC 
DGGTCTCTACAAAATTGTCCGACACCCC 
CATACCGCTGGCACTGGGCTCTTACTGGGCGA 



GGTCAGGGTGGAGGCGGGCCAGATATTGGCCTC 



rr^-GTCTCTACAAAATTGTCCGACACCCGATGTACGCCGGGAACGTGGTCATGATGACAGG 
35 CGACGGTCTCTACAAAATTva xTGTTCA TCCTCGTCCCCGGCACACTGGTGTTG 




GTSTTCCGCATCCTCGACGAGGAAAAACTACTGACGCAAGAACTCAGCGGGTACCGCGAATACC 
GGCAACTGGTGCGCTACCGGTTGGTGCCCTACGTGTGGTAG 

>Rv1223 htrA TB.seq 1365810:1367456 MW:56547
>emb|AL123456|MTBH37RV:1365810-1367459, htrA SEQ ID NO:44

CTGAG^C^TTGTCGCAGCGCATGGCGGGGTTGCTGCGAGTTCATGGCGAGTGGTCGC^ATCC 

GTGGATACTAGGGTGGACACGGACAACGCGATGCCTGCACGTTTTAGCGCCCAGATTC^^AAT^ 
GAGGATGAGGTGACCTCCGACCAAGGCAACAACGGCGGCCCGAACGGCGGAGGCCGCCTGGC 
GCCGCGCCCGGTTTTTCGGCCACCGGTCGACCCGGCGTCGCGTCAAGCGTTCGGGCGTCCGT 
,0 CCGGGGTCCAAGGGTCCTTTGTGGCCGAGCGTGTGCGCCCGCAGAAGTACCAGGACCAGTCT 
' GACTTCACACCGAACGATCAGCTTGCTGACCCGGTGCTTCAGGAGGCGTTCGGTCGTCCGTTC 
GCGGGCGCCGAATCGCTGCAGCGCCATCCCATCGATGCCGGAGCGCTGGCAGCTGAGAAAGA 
CGGTGCCGGCCCCGACGAGCCCGACGATCCGTGGCGCGACCCCGCGGCCGCGGCCGCGCTG 
GGGACGCCAGCGCTAGCCGCGCCGGCACCGCACGGTGC6CTGGCCGGCAGCGGCAAGCTGG 
, 5 GTGTGCGCGACGTGCTGTrTGGCGGCAAGGTGTCCTACTTGGCGCTGGGCATCTTGGTCGCTA 
TCGCACTGGTGATCGGCGGCATCGGCGGTGTCATCGGCCGCAAGACCGCGGAAGTAGTCGAT 
GCGTTCACCACGTCGAAGGTGACCCTGTCGACCACTGGCAATGCCCAGGAACCGGCCGGCCG 
GT^CACCAAGGTGGCGGCCGCCGTGGCCGATTCGGTGGTGACCATTGAGTCGGTCAGCGACCA 
qq^^GCATCK^AGGTTCCGGCGTCATCGTCGATGGCCGCGGCTACATCGTCACCAACAATCA 

50 cg^gaTc^toaggcggccaacaatcccagccagttcaagacgaccgtggtg^caacgacgg 

caIggI^ccgccaatctggtgggtcgtgacccoaagaocgacttg^ 
^^cgtcgacaatctgaccgtggcccggctcggtgattccagcaaggtacgggtcggtga 

cgaagtcctcgcggtgggcgc^ccctggggctgcgoagtacggtgacc^agggcattgtca 

g^g^gctacaccgccccgttccgttgtcgggcgagggctctgacaccxsacaccgtcattgacg 


rr AGGTGATTGGCATCAACAi 

S^ggtcaacgagatgaaatt<*tggcaaa^^ 
cat^cgacgttgggcatcagcacccggtcagtaagcaacgcgatcgcgtcgggcgcgcaggt 

gS^tgtaaaggcgggaagtcccgcgcagaagggggggatcttggagaacgatgtgatcgt 

30 c^ggtcg^^accgcgcggtcgccgactccgacgagttcgtcgtcgccgtgcgccagttgg 

CWCGGCCAGGACGOTCCGATAGAGGTGGTCCGCGAGGGTCGGCATGTGACGCTGACGGTG 
AAACCGGACCCCGATAGCACCTAG 

>Rv1224 - TB.seq 1367461:1367853 MW:14083



CAATTCAGACCGACGCCTCGATCAACCACGGTAACTCCGGCGGTCCGCTAATCGACATGGATGC 
CCA^ST^^TTGGCATCAACACCGCCGGTAAGTCACTGTCGGATAGCGCCAGCGGGCTGGGCTT 






>emb|AL123456|MTBH37RV:1367461-1367856, Rv1224 SEQ ID NO:45
GTOTTCGCCAACATCGGTTGGTGGGAAATGCTCGTCCTCGTCATGGTCGGGCTGGTGGTGCTT 

GGCCCGGAGCGGCTCCCGGGTGCCATCCGCTGGGCGGCAAGCGCTCTGCGGCAGGCGCGCG 



VTTGGACCCGAATTCGATGATCTGCG

2AGCGGTGT(jAOOmoov^^ww - 

TTTGACAGCGATGCCACCTAG 

^12290^ «H«S*n«*.P— 1371778:1372947 

ATGCCAAGCCGCCTACACTGGGC<WTGATGTC^^^^ c ^ tcaccqaactggqg 



ATACGCACCGCGCTGGGCAMjCTAATCGA^^^^^^^^^^^^^^^^^^^^^ 

CCGAAATCA* 
SGTCAGCTTG 
TACCCGCGA 
TCCGGTAAG 

G3CCGCCGCGATGGCCGTCCGGGGCCT- _ _ igcctacccaggttgagtcgatgatcctg c 



ATGG 



iTCAAAAGCATCGACACCGGCCCGGA 



GCCGGCTGCCCGAAGAAGTCCGAAATG 



ACCGAGCGTGTCACCCGGGCGGTCGCCGACGTGCC 



AGGCACTTCGGCGGTGCGGGTCAGCTTGGA< 



iCGTGATGAGCGACGAGCAGCGCACCGAGCTGC 



^^^^"Sc^GTCGATCGGGGTGCTGGACGCTGATATCCACGGCC 



^r^^r^.^CAOXAOGGCAA^ 
TGGTGTG 

GGGGATCTGGACGTGCTGCTGCTGGAC _ ^^^^^^^OO^CCGCGGA 



GGCTCAACTGATCCCCAACGCCGAACTCCTGG1 



CTGCAAACCCGCCAACGCATCGTCGGCGTCGTGG 
3CACCACGATGCAGGTGTTCGGCGAGGGCGGT 
CTGGTCGCCGAGCGGTTGTCGCGTGCGGTCGGCGCCGACGTGCCGCTGCTGGGTCA 



GGTGGCCGAACGGGCCGGCAGCATCGCGC 

AGAAC 
GGCC< 

GATCCCGCTGGACCCCGCAC - ATCQC CGACGGCTTGTCGACTCGACGAC 



.OAACATGTCGGGGCTCACGCTGCCGGACGGCACCACGATGCAGGTGTTCGGCGAGGGCGGT 

g ^ GCT ^ T ^ C ^ G0 ^ gc Wggcgattcgggcgtaccgctcgtgttgagct 



CGCCGGACTCGGCGATCGGCAAGGAACTGCATAGC/ 
GCGGATTGGCGGGCATGTCGCTGGGGTTGGACCCGACACGACGCTAG 

,^239=00* ^nes^andcoba^-spcK^TB-se, ,381943:1383040^:41470 
>«^1~™ g ~ 

C^^CACGAATCGGTGGTACTGGCCCGCGAGATCGTCAAAACCGGCGAGATCATGATCTTCGT 
.^r^GGATTTCGTGGTCACCGTCCGCCACGGCGAACACGGCGGGTTATCCGAGGTGCGTAA

GAG^W^^AGCGTTCGCGCCGGGCCGCAAGCTCGACATCGAACCGATCTATCTGCTCAAGCGG 
GA^GTGGTCGAGTTGCGCCGGTGCGTGAATCCGCTATCGACCGCATTCCAGCGCATGCAGACC 
GAGAG^AAAGA^CTCATTTCGAAAGAAGTGCGGCGCTACCTGCGCGACGTCGCCGACCACCAG 
^GG^CCGACCAGATCGOCAGCTACGACGACATGCTCAACTCGCTGGTGCAGGCCGC 
ACCGAGGCCGCCGACCAGATC VTGGACATGCGCAAGATATCCGCGTGGGCAGGTAT 



GCTCGCCCGGGTCGGCATGCAGCAAAACA' 



r»Trr. M GTOCCCAOCATGATCGCGGGCATCTATGGCATGAACTTTCACTTCATGOCCGAGCTG 
GACTCCAGGTGG^^TACCCGACAGTGATCGGC^GGATGGTCCTTATCTGTCTGTTCCTCTACC 

ACGTCTTCCGCAACAGAAACTGGCTCTAG 

>Rv1279 - TB.seq 1430060:1431643 MW:57332

>emb|AL123456|MTBH37RV:1430060-1431646, Rv1279 SEQ ID NO:48
ATGGACACTC^GAG^GXcTACGTCGTGGTCGGTACCGGCTCAGCCGGGGCGGTrGTGGCCAG 

C^CGGCT^GCACCGATCCGGCCACGACGGTGGTGGCCCTGGAGGCGGGGCCGCGTGACAAGA 

AC^&AlTcl^rcG^CGTCCCAGCGGCGTTTTCCAAGCTGTTCCGCAGCGAGATCGACTGGGA^A 

CCTA^CCGW^CGCAGCCGGAGCTCGACGGCCGCGAAATCTATTGGCCTCGTGGCAAG^TGCT 

C^^^GCTCGTCGTCCATGAACGCAATGATGTGGGTGCGTGGATTCGCATCAGACTACGATGA 

qtqqqcCG^^C^^GCCGGTCCGCGGTGGTCGTACGCCGACGTGCTCGGCTACTTTCGCCGCA 

TCGAGAACG^^^CGCTGCCTGGCACTTTGTCAGCGGTGACGACAGCGGAGTAACOMTCCGT 

yQ^TA^^rcCCGGCAACGCAGCCCAAGATGGGTGACCGCAGCGTGGCTGGCAGCCGCACGTG 

AGTCCGG^TrTGCCGCTGCGCGGCCGAATTCCCCTCGACCGGAAGGCTTTTGCGAGACCGTCG 

TCAC^GAG^GCCGCGGTGCTCGATTCAGTACTGCCGACGCCTATCTGAAGGCCGCGATGCGCC 

G^AAAAA^CTCCGTGTGCTTACCGGCGCCACTGCTACCCGGGTGGTCATCGACGGCGW^CGGG 

CCGT^GGCGTGGAATACCAAAGCGACGGTCAAACCCGCATCGTCTACGCCCGCCGCGAGGTG 

^TOCTC^GC^CTOGTGCCGTCAACAGCCCTCAGCTGCTGATGCTCTGCGGCATCGGCGACCGC 

gIccIcctgggogaacaogacatcgacaccgtttaccacgcgcccgaggtcgggtgcaacctg 
GACCACCTCGCCGAAU ACGTC gaaaaggacagcttgtttgccgccgaga 



CTCGATCATCTCGTCACGGTGCTGGGTTTCG/ 



.CGCCGCGGCATGCTCACCTCCAACGTCGGCG 



GCCT^l^^^CGAACCGCGTTACCTGTCCGATCTCGGTGGCGTAGAGCGGGCCGCCATGATGGC 

Sg^t^tatgcgcgcggatcgcggaggcccgcccgctcagagatctocttg^^ 

^^3CGCGACCGCGCAACAGCACCGAGCTGGACGAGGCCACTCTCGAGTTGGCGCTGGCCACT 
T^rc^CCTGTACCACCCGATGGGCACCTGCCGCATGGGOAGCGACGAGGOCAGCGT 

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GACTTAATCCGCAGCTGA 

>Rv1294 thrA homoserine dehydrogenase TB.seq 1449373:1450695 MW:45522



CGGTGGAACCGTCGCGCAAGGCGATCC 
GCGAACAAGGCTTTACTCGCCACCTCCACCGGO 
GTTGATCTGTATTTCGAGGCGGCCGTGGCGGG " 



TGGGCGCCCTTGAGCGCGGCAAGTCCGTCGTTACG 



GAATTGGCACAGGCCGCCGAAAGCGCCCAT 
CGCCATTCCGGTCATCCGTCCGCTCACCCAG 



CG< 



TCGA< 



GCCCTGGTACCTCTGTCGCATCCGCTTGCCC 
GGCCGAGGCCGCGGGCCGGCTGATGTTCT 



ACGGCCAGGGCGCGGGCGGCGCGCCGAC 



CGCCTCTGCGGTGACCGGTGACCTAGTGATGGCCG 



CCCGCAACCGGGTACTCGGCAGCCGCG 



TGGCACCAATGGGTTTCATTGAAACGCGCTA 
CGCCGACAAGCC^GGCGTCTTGTCCGCGGTGGCGGCGGAA^CGC 
CAGGAGGGCGTTGTGGACGAAGGTGGTCGAC 
CCTCGCCACTGACGCCGCACTCTCGGAAACC 
GGGTGTGTCCAGCGTGATACGACTGGAAGGA 



GCCCCCGTGAGTCTAAATACGCTCAACTTCCGGl 
GTTGATGCACTGGACGACTTGGATGTCGTGCA 



ACCGGCTTATGA 

>Rv1323 fadA4 acetyl-CoA C-acetyltransferase (aka thiL) TB.seq 1485860:1487026 MW:40049

>emb|AL123456|MTBH37RV:1485860-1487029, fadA4 SEQ ID NO:50




^^^rTACGGCGACGTTACGGTTTTGGACCACATGGCCTACGACGGTCTGCACGACG 
CGGGTTAC^GTACG^ 

TG TcG^ 



GCTCCGAACA^^« . — ATCCCGC AGCGCACGGGCGATCCACTGCA 
GGC GTATTCGCCGACGAGGTGATCC^ 

G ^CACCGAGG = 

GCCGCGGTGGTGGTCATGAACCAGGAAAA CAAT CGCAGCCGGCCAACG 
G ATCGGCGCCCACGGTGTGGTGGCCGGG = 



TCAACAAGGCGCTGGATCGCGAGGGCAT 



^CGCGAACTCGGGCTGAACCCCCAGATCG 



VTTGCCGTCGGGCATCCCCTCGGCATGTCAGGGACGCGAATCA 



CGA. _ 

ACGAGGCGTTCGCTGCGGTGGCATTGGCCTCGATAC 

rrsr&s 

GGGGCTGGCGGGCAGGGCGACGCACTGATATTGCGGGCCGGATAG 

GTGAGCGTCGGCGAGGGACCGGACACCAAGCCGACC^TGGCCAA^ 
CACGTGTGGTGGTGCTGTCCGGTGG^^ 

GCGAGCGGATCCCGAATCTGCATTTCAGTGTG^^GCCAC^AC^^^^^^ ^^^^^^^^^ 

GG TGAGTTGCTGGAATGGGCAGAAATC^ 
GCCGGTGCGGGCGGCCGCGGCG^ 

CCAGGGCGATCAAGAAGACGATGCCCGAGGCTGTCA^T^ 



GA 
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GCGAACTCGCTCCTGATGCGCAGAACGATCCGATCGGGCATGCCGCGTTCGTGCATG^GCATC 






CCCGATGGATCGCCCAGGCCTTTGCTGACGCGTTGGGCGCGGCGGTCGGGGAGGTCG_AGGCA 

3ACGAACGGC 

CGCCGGCGAACTGGCCCGCGCGGTGCGCGGAACCGTCGGTCGGTATTCGCCGTTTGCGGTGT 

ATCTGCCGCGCGGTGACCCGGGGCl 
GGACGAGGGCAGCCAGTTAGTCGCCCGAGCA" 

GACGGTGGCTGG 
TTGCAGTGCGCAl 

GGATGCGCCCTGCACCGGGCTGGGCGCGTTACGCCGT 



atctGCCGCGCGGTGACCCGGGGCGACTGGCGCCGGTGCGCGACGGCCAAGCGCTGGTCCA 
5 ATCTGCCGCGCGGTGACU - GAGCATTGACC CTGGCGCCAGTCGACGGCGATACCG 
CGGTGGCTGGACCTGTGTGCCGGACCGGGCGGGAAGACGGCGCTGTTGGCCGGGCTGGGT 

ACAGAACACCCGGGGGCTGCCGGTTGAGCTCTTGCGTGTCGACGGGCGGOACACCGACCTCG 



GACeu . , — — T ^ CGOTGTGGAACCC TCGCCACACCGCGCGGACCTGGTAGC 



ACCCGGGTTTCGACCGGGTGCTGGT' 



CGGCCGGAGGCCCGTTGGCGTCG 



TCAGCCGGCGGACGTAGCGGCACTGGCCAAGCTACAACG 



CGAGTTGTTGAGCGCCGCCATCG 1 



CGCTGACTCGGCCCGGCGGTGTCGTGCTCTATGCCACATG 



;GCCGCACCTGGCCGAGACTGTGGGTGCTGTCGCCGACGCGCTACGCCGACATCCGGTTCA 

;CG 

CGCCATGTTCGCCGCGGCGTTGCGCCGCCTG 



CTCGcc— , GCCGGTGATCGCGGGGCTGGGGGAGGGGCCCCACG 



CGCGCTCGATACCCGCCCACTGTTCGA 
TTCAGCTGTGGCCGCACCGGCACGGTACCGA

ACGTGA 






>RV1409 ribG riboflavin biosynthesis TB.seq 1585192:1586208 MW:35367 

>emb|AL123456|MTBH37RV:1585192-1586211, ribG SEQ ID NO:53
ATC^^GTO^^GC^!&3TCAAGAGCATCGACGAGGCTATGGGTCTCGCCATCGAGCACTCCTAC 

ATGAACGTGGAGCASerL. 1A . „ cra - AGTGGGGGC CGTCATTGTGGATCCCAACGGT 

C^* 700 ^ 00 ^^^ CCGGCGCCAT CG TGGTGGTCACCATGGAACCCTGTAAG 

^ctaSgc^^^S^Sog^SSgctctgatcgaagccagggtggggacgg^^; 

TAC^CGTC^CGACCCGAACGGGATCGCTGGGGGTGGCGCGGGCCGGCTGTCAGCAGCGG^ 



GCCTACAGGTGCGGTCCGGGGTGTTGG' 



CTGAACAGGTGGCGGCCGGACCGCTGCGGGAGTGG 



\CAAGCAACGCACCGGTCTGCCGCATGTCACC 



TGGAAGTACGCCACCAGCATCGACGGC 



CTCCACaaww^wv- . ~- ^^^ CAGCGAGG CCGCACGCCTGGATCT 



:gccgccgccgacggctccagccagtggatctcc 



rCATCGCCGCCG^^^^ 
30 ACCCGGC^ 

30 G^"GGTGGGC^^GCGCGACATACCGCCGGAAGCACGGGTCCTCAACGACGAGGCACGCACCA 
TGATGA" 



TCCGCACCCACGAACCTATGGAGGTGCTCAGGGCGTTGTCGGATCGCACCGACGTGC 
jQCTGGA^^GA^GTCW^CCCTCGCCGGCGCCTTCCTACGAGCGGGTGCGATCAACCGGATCC 



JCCTGTTGGGCGGTCCGGTTACCGCGGTCGATGACGTCGGGGTGT 
TGCGT 

GCTGAGCTTGGTGGCTCGTTAG 



TGGCCTACGTCGCACCGA 




CCA^ATCACCAACGCGTTGCGTTGGCAGTTCGACAGCGTCGAAAAGGTCGGACCGGATCTGTT 



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>RV 1440 secG TB.seq 1617715:1618065 MW:12140 

>emb|AL123456|MTBH37RV:1617715-1618068, secG
[^^ TGAGGCGCG ^ GGCCTGG
GTGGCAGGCGTGACAGCCGCGGTCAGTGCACG^ 

TGTCCATCATCGGCGTGGCGTTGCTCATCAAATACCGCTAG 






>Rv1484 inhA TB.seq 1674200:1675006 MW:26529
> OT ^12*S 8 ,MTBH37 R V : 16742^~ 

GGTTTCACATCGCACGGGTAGCCCAGGW^AGGGCGCCCAGCTG^^^^^^^^^^^^ 
C^CTGCGGCTGATTCAGCGCATCACCGAC^CTGCCGGGAAA CGGGG 
16 CGACGTGCAAAACGAGGAGO^CT^^^AGCTrGGCCGGCCGGGT^^^^^^^^^^^ 
CGGGCAACAAGCTCG^GGGGTGGTGCA^ 

ATCAACCCGTTCTTCGACGCGCCCTACOCa** CCGG AGG7TCCATCGTCGGCA 
CGTATGCTTCGATGGCCAAGGCGCTGCTGCCGA7W^GAACCCCGGAG^7T|^^^^^^^^^^ 

TGGACTTCGACCCGAG^GG^^GAT^^GGC^ACAACTOGAr^ACG^^^^^^^^^^ 

20 TTGGAGTCGGTCAACAGGTTCGTGGCGCGCGAGGC ^^^^^^^ 

OTTGCCGCAGGCCCTATCC^^O^ATGA^^ g ^ ct ^^ 

GCAGGGCGGCGCCCAGATC^GC^ 

25 > R v,6i7 W ^ TB - i r«or N °or :50668 

GTGAOGAGACGCGGGAAAATCGTCTG^CTC C1TCAGCCACG GCGACTACGA 






G^^GCTCGCCGACCTGCAGGGCCGGAAGArrcAGGT^GGG^^CTrc^^^^^^™»^^™^ 
CACTGGGCCGAAGGCGAAACCGTCCGGATCA^GTGGGGG^TGCG ^ 
GGTGTCCACCACCTACAAGCGGCTAGCCCAGGACGOGGT^GCCGGT ^^^^^ 
ACGACGGCAAAGTCGCATTGGTGGTCG/^^^^^GAGG^CGACG ^^^^^^^^^ 

OTGGAAGGCGGCCCGGTCAGCGAC^^CATCTCG G ^ CCTCGGCGTCGACAT 

^^^^^^^<^^^ 

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GGTGCCGGTGATCGCCAAGCTGGAGAAGCCGGAAGCCATCGACAATCTCG 

TO 
iCC 
AT< 

rGC- 

CCGGACAATGTCGCGCATCATCTGCGCGGT 



AAGCGATCGTGCTGGCGTTi 
CGCTCGAAGAGGTCCCGCTGGTAI 
CCGGTCATTGTGGCGACCCAGATGCTCGACTCGA 



CAGAAGCGAGCCATCCAGATGGCCCGGGAGAACGCCAAG 
JGATCGAGAACTCGCGGCCGACCCGAGCT 



GAGGCCTCCGACGTCGCCAACGCGGTGCTCGA 



TGGCGCCGACGCGCTGATGCTGTCCGGGGA 



GGTCCCGC 
TGGCGACO 
5ACGTCGCC 
GGGAAGTA 

e^G^^ 

^^SSg^ggcccocctgcataccc^ 

GTCCGGTGATACCGTGCG<*. TGQMATGACCTGGGG CACCGAOACGTTCATCGTGC 

GC^CAAG^G^GGTGACTTGGTGGTCATCGTCGCGGGTGCGCCGCCAGGCACAGTGGGTTCGA 
CCAACCTGATCCACGTGCACCGGATCGGGGAAGATGACGTCTAG 

>Rv1630 rpsA 30S ribosomal protein S1 TB.seq 1833540:1834982 MW:53203
>emb|AL123456|MTBH37RV:1833540-1834985, rpsA SEQ ID NO:57

^w^^COG^AATAGACAAAACGATCAAGTACTTCAACGATGGCGAGATCGTCGAAGGCACCA 

tcctx!aa^gtggac^^ggacgaggtgctcctcgacatcggctacaagaccgaaggcgtgatcc 

20 A gg^a^gg^^^ 

^cgc^cag^a^g/^cgtgcctggggcaccatcgaggcgctcaaggagaagqacgaggccgtc 

aagggcacggtcatcgaggtcgtcaagggtggcctgatcctcgacatcgg^sctgcgcggtti^c 

^gaggccaagatcatcg^^^ 
25 ctg^^tggagc^g^^cagtccgaggtgcgcagcgagttcctgaataacttgcaaaaaggcac 

catocgaa^g^gVgtcgtgtcctcgatcgtcaacttcggcgcgttcgtcgatctcggcg^stgt 

ggac^^^tg^^gcatgtctccgagctatcgtggaagcacatcgaccacccgtccgaggtggt 

^g^ggt^gaggtgaccgtcgaggtgctcgacgtcgacatggaccgtga^ 

QjYGTCACTCAAG^^G4Xn , CAGGAAGACCCGTGGCGGCACTTCGCCC^CACTCACGCGATCGG 
GCAGATC^rcCWSGGCAAGGTCACCAAGTTGGTTCCGTTCGGTGCATTCGTCCGC^TCGAGGA 

ggg^atcgIgggcctggtgcacatctccgagotggccgagcgtcacgtcgaggt^ccgatc 
aStgI^gStcggcgacgacgcgatggtcaaggtcatcgacatcgacctggagc^ 

gg^c^ggctgaagca^^ 
35 acggcatogc^^acagttacgacgagcagggcaactacatcttccccgagggcttcgatgccg 

a^aco^c^am^gcttgagggattcgaaaagcagcgcgccgaatgggaagctcggtacgccg 

^^cgcggggacaagatgcacaccgcgcagatggagaagttcgccgccgccgaggcg 
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K3TAGCGCACCGTCGGAAAAGACCGCGGG 

TGGATCACTGGCCAGCGACGCCCAGCTGGCGC 



GA 

>Rv1631 - TB.seq 1835011:1836231 MW:44669

>emb|AL123456|MTBH37RV:1835011-1836234, Rv1631 SEQ ID NO:58



15 CGT 



CGCCGACGTCTGGCTGGACAACTCGGGCA 



ACGTCTGGAACACGCGCGTCCA 



GCCCTTCGCGCACAACCTGGCCCAACGTCAGATTGCGCGCG 



CGCCGGCTAGGTTGGTGCCGGCGGA 



JCCAAGCTGGCCGGATCAGGCGCGGCGCATCGTCAAC 



CGGCTAAAGATCGCGTGCGGGCA" 



TAAGGCCTTGCGAGTTGACCACATTGGGTCAACCGCCGTG 



CTAGCCAAGGATGTCATCGACATCCAGGTCACCGTCGAATCACTTG 

GCCGCCGGCTACCCACGCCTCGAGCACATC 
AGGACACCGAAAAGACCGACGCTCGCAGCACCGTCGGCCGCTACGACCACACCGACAGT 



TCGGGCTTCCCCGATTTT 1 

ACGT< 
ACCC 
GCCG 
CCTG 



ACGTGGCCGACGAGCTGGCCGAGCCCTTGCTG 



CGGGTGCACGGCTGGCCCAACCAACAGTTCGCCCTGCTGTTCGTCGACTGGCTGGCGGC 
f*AAT^CC(^3CGCGAGAGAAGACTATTTGACGGTCAAGTGTGACGCCGACAGGCGCGCCGACG 
25 ctG^^^TCGCG^GCTACGTCACCGCCAAGGAGCCGTGGTTCCTGGATGCCTACCAGCGGGCAT 



GGGAGTGGGCGGATGCGGTGCACTGGCGTCCCTGA 
>Rv1706c - TB.seq 1932695:1933876 MW:39779 



>emb|AL123456|MTBH37RV:c1933876-1932692, PPE SEQ ID NO:59
, AT^A^icT^GATGTCCC^GTCAACCAGGGGCATGTCCCCCCGGGCAGCGTCGCCTGCTGCCTr 

^^^r^TCAC^GCCGTTGCTGACGGCATCGCCGGGCATTCCCTGTCCAACTTTGGGGCGTTA 

^G^^G^ACGGGCTGGCCGCAGAGTTGTCGTCGGCAGCGACTG^TACGGTGCGG 
^TCTC^GCTtACAAACATGCGGTGGTGGTCGGGGCCGGCATCGGATTCGATGGTGGCC 
3S GCCGTC^^^C^CT^GTCGGCTGGCTGAGTACCACCGCGACGCTAGCCGAACAGGCCGCGATG 

Sggctagg^cgcagcggcctttgaagccgccttcgccatgacggtg^^ 

^TCGCGGCCAACCGGACCTTGTTGATGACGCTCGTCGATACCAACTGGTTCGGGCAAAACAC 
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WO 01/35317 

tStc^^gttgtggagcaccgcaitggcagaagccggtgcggocgaggcagcggcc 

^^^I^^^T^jqq a t(3(3tCGGCGTGGTCGCAGCTGGGAGCTGGACCGGTGGCGG 
^^OOQOaAC^^^^^GC^Al^^CGGACCGATGTCGGTGCCGCCGGGCTGGTCCG^^CG 

c^g^gcSca^^ 

q^^^^^^^^XCGGTCCTACTCCGGGGGGCACCGACTCCGGGCAGGAGTCGCGCCGCCC 
^^^^^^ACGOO^^ATGGAAGACGACTCACCGTGATGGCTGACCGGCCGAACGTCGGATAG 

>Rv1745c - similar to Q46822 ORF_o182 TB.seq 1971381:1971999 MW:22490
>emb|AL123456|MTBH37RV:c1971999-1971378, Rv1745c SEQ ID NO:60

^g^t^^g^ggccgctatggcc^tggcacggtggaaaacgaga^ 

C^^CTTCOTG^C^CT^^^A^TGG^^rACCTGACCAAACTTGGACCATGTCCGGCACAGTGGC 

ccgtggccgacgactgccggctaccgaaagccgcacatggtaattaa 

>Rv1800 - TB.seq 2039451:2041415 MW:67068

>emb|AL123456|MTBH37RV:2039451-2041418, PPE SEQ ID NO:61

at^ctg^c^avittcgc^gtgctgccccccgaggtcaattcggcgagggtgttcgccggtgcg 
^^^tottagcggcagcggccgcctgggatgatcta^ccgagotgcattot 
gc^gca^^^^^^^gg^^tcggttacgxc^ggattggtggttgggtggtggcagggatcggcg 
t^gg^a^tggacgcagccgcgtogtacatcgggtggctgagcacgtcggctgccca 
^^^^agggcgtog^cggtctggctcgggccgcggtatcggtgttcgaggaggcgctg^^c 

Sacggt^atccggcgatggttgcggcaaatcg^^^ 

^^^ GCAGMCG cgcctgcgatcgccgcgctcgaatccttgtatgagtgtatgtgg 

GC^CA^Gi^^CAG^GGCCATGGCGGGTTATTACGTrGGGGCTTCGGCGGTGGCCACACAG^G 
^^^^QQ^gZ^jQQ^cGGCTACAGAGCATCCCCGGCGCCGCCAGTCTTGATGCCCGTCTGCCG 
^QCTCG^^CGAGGC^CCGATGGGAG7CGTCCGCGCGGTCAACAGCGCGATCGCCGCCAATGC 




COCTCTTCACGCCCC«G^TCT^ CGGCAGCAAA rrGCCGCCGGCAACAA 



CCAATGGCGGGGTTGCCACCAGGTTC JCTACACCATCGAATACGACGGCGTCG 



CCGGGGCCACTCCGCACAATCTGTACCCGACCAAGA 



CGACCCTCAACGCCATTGCCGGGACCTACTA 
CTGACGCCGGAACAAATTGACGCAGCGGTTCCGCTGACCAAT 



CCGACTrrCCGGGGTACCCGCTCAACTTTGTGT 
10 CGTGCACTCCAACTACTT^GCTGACGCC^G^ ^ GAACCTGCCGC TGCTAGAG 



CG, 



ACTCCGTTCGGGTTGTTCCCAGAGGTCAC 



CAGACGGGTCGACGATGCCAAGCACCGCAC^ ccg ^ tcgc ^ atcatttcgaag 
^TCGTGCCGTCGTACAACATCCACCTTrnTTI 



GATCGTGCCU . w» ■ ^"""^-..-c actcg0GGC CGACGTGGCACTGTTCACGGCC 

20 ccgatgggactcgtcaacgogg^^cc^ 



VCTCGTCAACGCGGTCGGATACCCACT 
GCAGGCGGTCTTCAGCTCTTGATCATCATCAGCGCGG 

CCATTGTCCCCTGA 

>Rv1844c gnd 6-phosphogluconate dehydrogenase (Gram -) TB.seq 2093732:2095186 MW:51548
>emb|AL123456|MTBH37RV:c2095186-2093729, gnd SEQ ID NO:62

A ^ AT „_ rrTCACTGGCCTG GCCGTGATG

GGTTCCAACATCGCCCGAAAU i wv, A rGGCAAGTTCGTGCGCAGTGAA 
^^^^^^^^^^^^^^^^^^CTQ^^^^ACCGCGT^^G^^CT^CrcATGGTCAAGGCC 



ACGATCCCCGAATTTCTTGCCGCAC . - TCAACGAACTT<3CTGACGC CATGGAACCCGGCGAC 
.TCATCGAOGGCGGCAATGCGTTGTACACCGACACCATGCGCCGCGAGAAAGCGATGCGT 



GGAGAGGCCACTGACGCTGACGCTGTC

ATCATCAT C C3A^o^ \TCTCCGGCGGCGAAGAGGGCGCGTTGAACGG 



GAGCGGGGCTTGCACTTCGTCGGGGCCGGGA1 




;aaggaagcctttga( 

TTGTGGCCCCGTATTTCCGCGGCGCCGTCGAATCGGCGATCGACAGTTGGCGGCGTGTGGTGT 



rGGGTATCCCGACCCCGGGATTCTCGTCGGCCCTGTCGTATTACGACG 



GGGCGAAGTTCCTCAACCACATCAAGGAAGCCTTTGACGCCAGCCCGAACCTGG^CAGT^TGA^ 
TTGTGGCCCCGTATTTC( 

GAAGTACCGGTGTAG 

>Rv1900c lipJ TB.seq 2146246:2147631 MW:49685

>emb|AL123456|MTBH37RV:c2147631-2146243, lipJ SEQ ID NO:63

TGATCCGCCTCGACCATCGTGGGGTCGGCCTGTCG 






GGCCGAAGTTCTGGGCCCAGGACGCGATCGCGGTGA 
ACAATTTTCGCC 

GGGTGCGCAGCCTGATCGTCGTG 



JGGACGCGGTCGGATGCGAGCAGGCG 



^^TTTTCGCGCCCAGTTTCCACGCCATGAACGGACTTGTTCTCGCCGCCGACTACCCCGAGC 
ACAATTTTCGCGCCCAGTTTC ^ cggctcggcgcgcccactatggG cgCCCGACTACCCG 



CCTGACGGTGGCGCTGGAACCGGATGCCGTC 



^^^TrTrGCCGGCAACCGTGCCGGACCGCCGAGCATTGCCCGTGCCGTTTCAAAG 
GTCATAGCCGAGGCCGAC^^ 



GCCTC 
GTCAT 



GCGATCGCGTGCGCGGACGACATCGTCGACGCGG 
TAT 

TCGACCGTGCGAGACATCGTCGCCGGATCA 



CGGTATTCATGCGGGCGAGGTCGA 



GGTGCGCGATGCCTCGCACGGTACCGACGTCGCCGGCG 



GGCCGGACCCAGTGAGGTGCTGGTGTCC 



35 TGGCCGTGCATATCGGT^GGGCGTCTGCGCG^^^^^^ 



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ACTCAAGGGCGTACCGGGCAGATGGCGGCTATGCGTGCTCATGCGCGACGACGCCACCCGCA 
CGCGCTAA 

>Rv1967 - TB.seq 2210599:2211624 MW:36516 

>emb|AL123456|MTBH37RV:2210599-2211627, Rv1967 SEQ ID NO:64
ATGAGGGAGAACCT^GGC^GCGTCGTGGTGC^ 

^™^^TGATTGCCGTCTTCGGGGAGGTGCGCnCGGCGACGGCAAGACCTACTA 
™^CGC^TGTCOAATCTGCGAACGGGCAAGCTGGTGCGCATCGCCGGCGTCGA 
GGTCGG^AAGGTCA^CAGGATCTCCATCAACCCCGACGCGACGGTGCGGGTGCAGTTCACCGC 



CAACTCGGTCACCCTCACGCGGGGCACCCGGGCGGTGATCCGCTACGACAACCTGTTCGG 
FGGAGGAAGGGGC 

GATTCCGTTGGCGCGCACCCAACCGGCGTTGGATCTGGATGCCCTGATCGGTGGATTCAAGCC 



10 ^C^G^CT^^ 



CrTGTTTCGTGCGCTGAACCCCGAGCAGGTCAACGCGCTGAGCGAACAGTTGCTGCACGCGTr 
TGCCGG>^^GGGGCCCACGATCGGGTCATTGCTGGCCCAGTCCGCGGCCGTGACCAACACCC 
TQGCCGACCGTGATCGGCTGATC^SGGCAGGTGATCACCAACCTCAACGTGGTGCTGGGCTCGC 
" ^CGCTCACA^TCGGTTGGACCAGGCGGTGACGTCGCTATCAGCGTTGATTCACCGGC 

tcgcgIaIc^gagcgacatctccaacgccgtggcctagaccaac^^ 

GTCGCCG^CT^TGTCGCAGGCTCGCGCGCCGTTGGCGAAGGTGGTTCGCGAGACCGArCG 

ggtg^ccggcatcgcggccgccgaccacgactacctcgacaatctgctca^cao^^gccgga 

^L»TACCAGGCGCTGGTCCGCCAGGGTATGTACGGCGACTTCTTCGCCTTCTACCTGTGCGAC 
20 ^jcGT^CT^AAGGTCAACGGCAAGGGCGGCCAGCCGGTGTACATCAAGCTGGCCGGTCAGGA 

CAGCGGGCGGTGCGCGCCGAAATGA 

>Rv1975 - TB.seq 2218050:2218712 MW:23650 

>emb|AL123456|MTBH37RV:2218050-2218715, Rv1975 SEQ ID NO:65
25 ATGTCGCGT^3AG^ATCG<^CACGTGTGCCTTGTCCGCGACCACCGCCGTCGCCATAATGGCT 

GC^C^CGCCG^^^GGCCGACGACAAGCGGCTCAACGACGGCGTGGTCGCCAACGTCTACAC 
^^CG^GCCGGCTGCACCAACGACGTCACGATCAAOCCGCAACTACAATTGGCCGC 
^^T.^qq^^^qccTCGATCTGCTGAACAACCGGCACCTCAACGACGACACCGGTTCTGACGG 

c^^caatcccgccgtagcgatcagcggcatcgagttgataaaccagtggtactacaaccccgc 

GTTrrTCGCGATCATGTCCGACTGCGCCAACACCCAGATCGGGGTGTGGTCAGAAAACAGCCC^ 

gIa^gc^tcgtggtggccgtttacggacagcccgatcgaccttccgcgatgocgcccag 

qqG^^^^^^^^^GACCGCCGTCCCCGGTGGCCGCGCAAGAGAACGTTCCTATCGACCCCA 
GCCCCGACTACGACGCCAGCGACGAGATCGAATACGGCATCAACTGGCTGCCATGGATCCTGC

GCGGCGTGTACCCGCCGCCCGCAATGCCGCCGCAGTAG 



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>R<1981C nriF *o„uceo«d. » subunK TB.-, 222422, -.2225186 MW:355 91 

>emb|AL123456|MTBH37RV:c2225186-2224218, nrdF SEQ ID NO:66

GATTTGCAGGTCTGGGAACGTTT ^qjxcCACCGAGCAGCAGACGACGATCCGGGTGTTCA 

CGGTGGGAGCAGTGGCCATGATCGACGAC 



ACGACCTGGCATCTTGGCAAACGTTGAC 
CCGGCTTGACCCTGCTCGACACCGCGCAGGCGAi 



->rf-GTCACCCCCCACGAAGAGGCGGTCCTGACCAACATGGCGTTCATGGAGTCAGTGCACGCC 

"g^ct^ 



AGCGAAAAGCGCAGATCATCGTCGACTACTACCGCGGTG 
CGACGCGCTCAAGCGCAAAGCATCGTCGGTAATGCTGGAGTCCTTCCTGTTCTACTCCGGCTT 

CTACC^ 



GGTCGGAACAGAACCCTTACCTGG 
10 A< 



GATCAl^o^ ™ .CCGCGAATACACCTGCGAGCTGCTGCACACGCT 

CTACGCGAACGAGATCGACTATGCGCACGAC 



GACCTGACCGACGCCGAGCGGGCCGACCA* 



CTTGTACGACGAGTTGGGCTGGACCGACGACGT 



TCCGAGATGAAGCCGTCCACGGCTACTACATCGGCTACAAATGTCAACGAGGTTTGGCC 

GCGGGCCGACCACCGCG 

KCTATGCGCACGACTTGT 
TTTrCCCTACATGCGTTAGAACGCGAACAAGGCGCTAGCCAACCTGGGATACCAGCCTGCATTC 

15 gTtcg^ 

GAACGACGACTTTTTCTCCGGCTCCGGAAGCTCATACGTAATGGGCACC^ 
GACACCGACTGGGACTTCTAA 

>Rv2092c helY helicase, Ski2 subfamily TB.seq 2349335:2352052 MW:99576
>emb|AL123456|MTBH37RV:c2352052-2349332, helY SEQ ID NO:67



GTGACTGA 



GCTGGCCGAGCTGGACCGGTTCACO 



GCGGAACTACCGTTCTCGCTCGACGACTTT 



CA 



GCAGCGGGCTTGCAGCGCGCTG' 



GAACGCGGCCACGGTGTGCTGGTGTGCGCGCCGACCG 



GCGCTGGCAAGACGGTGGTCGGCGA 



GTTCGCCGTGCACCTGGCGCTGGCGGCCGGCAGTAAA 



gcacgccgctgaaagccctgagcaaccaaaagcacaccgatctcacagcacgct 

rT^GTGATGACCACCGA^^ 

TCGCCGACCGGATGCGGGGTCCGGTGTGG 



GAGG^:^ 
^CGGCG^ 

ACGAGCATCGGCCGGTGCCGTTGTGGCAACACGTC 
TCGATTACCGGATCGGCGAAGCCGAAGGGCA 
TCGCGCATCGCCGTGAGGCCGACCGGATGGC 



30 CAACGCCGAGGAGTTCGGCGGTTGGATCCAGA^^^^^^^^^^^^^^^^^^^^ 

TCGATTACCGGATCGGCGAAGCCGAAGGGCAGCCCCAAGTCAACCGCGAGTTGCTGCGCCACA 
TCGATTACCGGATCOo 3CCGATTGGCAGCCTCGGCGCCGAGGCTCGGGC 



CGGCCCGGCTTCTACCGGCCACCCGGCCGA 



CCCGAGGTGATCGCCAAACTCGACGCTGAAG6 






GCTGTTGCCGGCGATCACCTTCGTGTTCTCCCGGGCCGGTTGTGACGCCGCGGTCACCCAATG 

VCCTGGCGGTACTCGGCTACTACGAATGGCGG 



CCTGCGGTCACCGCTGCGGTTGACCAGCGAAGAGGAGCGCGCACGGATCGCCGAGGTGATCG 

ACCACCGCTGCGGTGACCTGGCCGACTCCGAC 

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GAAGGGTTACTGCGCGGTCTGGCCGCCCACCAO 



GCGGGCATGTTGCCGGCCTTCCGGCACAC 
COOWCAACATOCCC^C^GT^ 



GGCCTGGCCTCCACCCGCACCTTTCCG 



CTGCGCAGCTCGTTTGCCCCGTCGTACAACATGACG 
, T ~ -rTRGTGGACCGGATGGGTCCGCAACAGGCGCACCGACTGCTCGAGCAGTCGTTCGCC 

^^^^|^^^^^^^CGAACTGGGCGGATCTGATGCGCCCATCCTCGAATACGCT^3ATTG 



CGCGCGCGGGTGTCCGAGCTGGAACGTGCGCAGGCCCGCGCGTCGCGGTTACAGCG^A^CGGC 
CGCCGCGGTGGTCTGGCCGTCGTCCTGGAATCAGCCCGCGACCGCGACGACCCGCGTCCGCT 



10 AGGCGGCCACCGAT^C^CT^GCCGCGCTGCGCCGCGGTGACATCATCAC 



====== 






CCTGGCCTCGGCGCTGCGATCGGCAGO 
GCGAGGCCGGCGGGTTTCACGATCCGGAGCT 
CCGGTGCATACCTCGCCCGGGCTCGAGGACCA 
GAACGCGACAACGCGCAATTAGAGAGGAAGGTCGCO 



CGCGGGTCTGGTTATTCCAGCCGCCCGGCGCGTCA 
GGAGTCGTCGCGCGAACAATTGCGCCGTCAT 
GATCCGCCAGGCCGAGCGTTACTTACGCATC 






GCCGCCACCAACTCGTTGGCCCGCAC 

=====2^— ==== 



TTTAAAGCCGGCCGAATTGGCGGGGGTGGTG 



GCCCACACCGCGGTTACGGCAGGCTCTGA< 






ACGAAGTTCG^ACGCTGCGCCCAACCCCGAACTGCGGGCTACCGCAAAGCGCGCTATCGGTG 
ACATTCGGCGCGGCGTCGTCGCGGTTGACGCCGGGTAG 

>Rv2101 helZ helicase, Snf2/Rad54 family TB.seq 2360238:2363276 MW:111632

. ~™ .^^r^^nr.ATACATCCGGGCAAACCCGCAACCGCCGTTTTGCTGTTGCC 

^GCTGATCCGGCTCGCCCCGCGCCCGGCCG 



GCG^G^^AGCCCGCCCCCGACGTCCGCTAGGGCGCGTCCGT^ACTACC^ 

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GGCCGAGCTGGCCGTTTTCGCGCGCGAGTTGGTCGAGCGTGGTCGCGTGCTGC^CAGCTGC 

GCCGCGACACCCACGGCGCGGCCGCCTGCTGGCGTCCGGTGTTGCAGGGACGCGACGTGGTC 

^GATG^CCTC^CTGGTCTCGGCGATGC^GCC^GTCTGCXGCGCCGAAGTTGGTGGGCACGA 

CCCGCACGMCTGGCAACCTCGGCTCTGGACGCGATGGTCGACGCCGCCGTGCGCGCGGCGC 

TGTC^CGATGGACCTGCTGCCCCCGCGACGGGGTCGCTCCAAACGGCATCGGGCCGTGGAG 

GC^G^CTG/^CGCGTrGACCTGCCCGGACGGCCGGTTCGACGCGGAGCCCGACGAACTCGA 

CGCG^^G^CGAGGCGTTGCGGCCATGGGACGACGTCGGTATCGGCACCGTCGGCCCGGCGC 

GGGCGA^iTTrCGGCTGTCCGAAGTCGAGACCGAAAACGAGGAGACGCCCGCGGGCTCGTTO^ 

TG^G^G^C^GGAGTTCTTATTGCAGTCGACGCAGGACCCCAGCCTGCTGGTCCCCGCCGAGCAG 

g^ggaILcgacggcagcct^ 
cgaIctg^cgggcctctcg^ 

cgtc^cttgagctcgacgccgacggcgcctaccgattcctgtcgggtagggocgcggtg 

CTTO^G^GGCTGGGTn'GGCGTGCTGCTGCCGTCCTGGTGGGACCGCCGCCGCAAGCTGGG 
C^^TCCGCATATACCCCGGTCGACGGCGTGGTGGGCAAGGCCAGCAAGTTCGGCCG 



CGAGCAGCTCGTCGAGTTCCGCTGGGAGCTGGCCGTGGGCGACGATCCGCTCAGCGAGGAGG 
AGATCGCGGCGCTGACCGAAACCAAGTCCCCGCT 



rGATCCGGGTGCGTGGCCAGTGGGTCGCG 



.CCGCCGAGATOCTCGCGCTGGCCGCCAGCCACOCCGACGACGTGGACACCCCGCTCGA 




AC 

CTCGATACCGAACAGATGCGCCGCGGGCTGGAG 

^^s^/-»/-»r» Ar*ATrrTrr5r.RnTG_ 

GGTCACCGCCGTACGCGOCGACGGCTGGCTCGGGGACCTGCTCGCCGGGGCCGCCGOGGCG 

20 TCGCTGCAGCCGTTGGACCCGCCO 

JCCTGGAAACCTTGGAATCCGTTCAGCGCCACCAG 



GGAGCGTAAGCCAACCGGCCGCAAG 

. — ^^»^^»/^^^oT«ftr , of5r > .r!AGCCACC 

ACCAC 

GACGGATTCACCGCGACGCTGCGTCCCTACCAGCAGCGC 
GGTCTGGCGTGGCTGGCGTTTTTGTCCTCGCTCGGTTTGGGCAGCTGCCTGGCCGACGACATG 

GAAGCGG^CAG^^^GCACCCAACCTGCGGGTGTACGCCCACCACGGGGGCGCCCGG^TGCA 



CGGCGAGGCGTTGCGCGACCACCTCGAGCGCACCGACCTGGTCGTGAGCACCTATACCACCG 
CCt 

CA<*^ U , — ~ GAACCGGCTCGCCGAGCTGTGGTCGATCATG 



" CCACCCGCGWDATCGACGAGCTGGCGGAATACGAATGGAACCGGGTGGTGCTGGACGAGGCC 
VG^T^AACAGCCTGTCCCGGGCGGCCAAGGCGGTGCGACGGCTACGCGCGGCGC 

CCTGCGCCGGCTCAAGACCGACCCGGCGATCATCGACGATCTGCCGGAGAAGATCGAGArTCAA 
^ACTGOCAACTCACCACCGAGCAGGCGTCGCTGTATCAGGCCGTCGTOG^GACATGAT 

gg^tgg^ 

^CTCAAACAGGTGTGCAACCACCCCGCCCAGCTGCTGCACGATCGCTCCCCGGTCGGTCGGC 
35 GGT^GGGAW^3TGATCCGGCTCGAGGAGATCCTGGAAGAGATCCTGGCCGAGGGCGACCGG 
GTGCTG^GPnRACCCAGTTCACCGAGTTCGCCGAGCTGCTGGTGCCGCACCTGGCCGCACGC 
TTCGGCCGTGCCGCCCGAGACATTGCCTACCTGCACGGTGGCACCCCGAGGAAGCGGCGTGA 

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GGGCGGTACCGGGCTGAACCTCACCGCCQ gatcqggCAGCGGCGCACGGTG 



CGCGATCTGCGCGAGGTGTTCGO 



GCTGTCCGAAGGCGCCGTCGGTGAGTAG 



TATCTTCTTTCACTGACTTCCTGCG tgcg ^ tgcog ^ GCACCACCATTGTC GCGCTG 



^GCGCCACTCGCAGGOGGCGA^ - ^ c8TTCOACG cA Q GGCAACATGATrTCT 

CGCCGCCGGCGGTTGGAACATCGAG 
rCGCGAAGTCGTCGATGAAGAAGTTG 



CTGGCGGCCGCGATGCAGGGTCTGCTGGCGTTGC — ^ ^ ^qqqqqqqgcGGTTGGAACATCGAG 
GAAGAGGG^TATCAGGCGGTGGGCTCGGGTTCGCTGTTC 



TCTGACCCGCAGAGCGCGGGTCGTATCGTTTCGTTCGACG 



20 ^^^^^J^^^^^^^^^q^^^CCG^ACCTGGTGCGGGGCATCTTTCCGACGGCGGT 

iCCGACGGGGCGGTTGACGT 
CGATCATCGAAAGCCGTTCGGGTGCGGATA 



GGTGGCGGTCGAGGCGCTCTACGACGCC 



>Rv2118c - = B2126.C1_165 (83.6%) TB.seq 2377471:2378310 MW:30091
>emb|AL123456|MTBH37RV:c2378310-2377468, Rv2118c SEQ ID NO:70

CCAAAGATGCGGCCCAGATCGTC ynTGCTGCGGGCGGTTGGGCCGGCCGGACA 



GCAGGAGCCGGATCCGGTGCTCTGACCTTGTC 



iTGCCGAACACGCCCGGCGCAATGTGAGCGGCTG 



CGTCAGCGACCTCGCCGACTCCGAACTGC 



GGTGATCTCCTACGAACAGCGCGCCGATCAT 

GTATCG C ^,v^, w ™ VGTGCTGGACCGAACCGAGAGCCTGGGAGAC 



VTCGCGGCTGCTGGTCGCCGGCGGAGTGCTGATGGTCTACGTGGCCACCGTCACTCAGCTG 

TCGAGGATCGTGGAGGCACTGCGGGCCAAGCAC 

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.OCGOOOCTOOAACGTCOTAGOOTTGOCGGTTCGOCCOCAOCATTCGATOCOCOOaC 



ggtcgt; 



fAAGCGCGAGGGACGCGACGGGTAG 



>Rv2144c - TB.seq 2404166:2404519 MW:12028 




CCAGCAACCAGCT^TGGCCTGGGTATGC^rcG^^^^^^^^^^^^^^^^^^^ 
TCGTCGATGCG^AG^C^GC^GG ^^^^cc^GC 
^GTCGACGCCGGTGTG^r^3G^^^^^^GCCATCGGAGGAGGCCAGCGAAGCGACCGAGG 
AGTCGGCGGTATCGGCGGACCGAAGCGACGACAGCGCCAAGTAG 



>Rv2146c - mse, 2405667:2405954 MVWOBOS 

>emb|AL123456|MTBH37RV:c2405954-2405664, Rv2146c SEQ ID NO:72
> emb ^ 12M ^™™'^ T " GGG ^cocGCTGTTCATCTrCTGGCTGCTGCTGATCGCTCG 
TTGGTGGTGT^^GATC^GGGTTCG TCCCACCGGTGTCACCGTOGT 

^CGTCGTTGAGTTCATWGCT^TTC TGAAGGTGC TGCGCCGGCTGATCCC 

^^^^^^^^^^ 
CATCGGTATGCAACTGGCGTTTGGTGCTGCGGCCTGA 

>Rv2147c - TB.seq 2406119:2406841 MW:27630

>emb|AL123456|MTBH37RV:c2406841-2406116, Rv2147c SEQ ID NO:73

^s============== 

^I^^A^GGCAGCCCGGTCATCATGGATCTGGTGTOGATGGACAACGCCGAT 

™SctgctSat^cggggccggcgtggccttcgcgctgcgcggctcgttcgao^ 

S^^^COTGCTCTCGCCTGCAGACGTCGATGTGTCCCCCGAGGAGCGCOG 
kGGATCGCCGAAACCGGGTTCTACGCCTACCAATAG 



CA 

>Rv2148c - TB.seq 2406841:2407614 MW:27694

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>emb 



ATi 



GGCGGCGGATCTTTCGGCGTATCCAGA( 
GCGATCGCGACTTGCGGCGGCCGCGGAGGCG' 






GCGGGTCGCAATGTCGGCGAAATTGAACT 
ATGCGATC^mv. . , -~ ' _ a xr TTGCGATTTTGTTTCG ATTGGGTTGTCGGTC 

^C^OGOTCACTCGGGGGGTGTOCACTGGCACATC^TGGOCC^ATT^C^ 
^^^^CGCTGGCTCGCTGGGCGCACAGCGCTCACTCGGTGGACAGCTCGCGGTTG 

======= 

CGCGGCGGTTACGGTCACCGTGA 

>Rv2150c ftsZ TB.seq 2408386:2409522 MW:38757



AACGO 



CGTCAACCGAATGATCGAGCAGGGCCTCA 
U3GCGTTGTTGATGAGCGATGCCGACG-P 



CAAACTCGACGTCGGCCGCGACTCCACC 



20 o * co, ?^£^^ 

ACGAGATCGAAGAGCTGCTGCGCGGTGCCGAC, 



CGGGGGCTGGGOGCCGGCG<^^G =GOTGmGT ^ ccG ^^ 
GGAACCGGCACCGGGGGGGCAOCCGTCGTCGCCAGCATCGCCCGCAAGCTGGGCGCGTTGAC 



CGTCGG^GGTCACCCGGCOGTTCTCG^ 
25 TGGCATCGCGGCGCXGCGGGAG^ 



GCAGATGGGAGATGCCGCGGTATCGCTGATGGAT 



CAACGGCGTGCAGGGCATCACCGACC^ 



CGACGTCAAGGGCATCATGTCCGGTGCCGGCACCGCA 



GCGAAGGCCGGTCGCTCAAAGCGGCCGAGAT 



CGCCATCAACTCGCCGTTGCTGGAAGCCTCGA 






rGGAGGGCGCGCAAGGCGTGCTGATGTO 



GATCGCCGGCGGCAGCGACTTGGGCTTGTTCGAG 



ATCAACGAGGCGGCCTCGTTGGTACAA 1 



GACGCCGCTCACCCCGATGCCAACATCATCTTCGGC 



^^G^G^TC^^CCGCAAGCCGi^GATGGGCGAGACCGGCGGCGCCCAGCGGATCGAGT 



GTGATCGACGATTGGCTCGGTGACGAGGTGCGGGTGACCGTGATCGOGGCCGGCTTCGAC 

3A1 

TCATGCGCCGCTGA 

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»Rv2152c muiC TB.seq 2410639:2412120 MW:51146 
>emb|AL123456|MTBH37RV:c2412120-2410636, murC SEQ ID NO:76

gTg^caccgagcagttgccgcccgatotgcggcgggtgcacatogtcggcatcqgcggagc 

TGGCATGTCGGGCATCGCCCGAATCCTGCTGGACCGCGGCGGGCTGGTCTCCGGGTCAGACG 

^CAAGGAGTCGCGCGGTGTGCATGC<3CTGCGGGCGCGGGGCGCGTTGATCCG<3AT(^GAO^C 

G^GCGTCGTCGCTGGACCTGTTGCCCGGTGGCGCCACGGCGGTCGTCACTACCCATGCCGC 

CATCCCCAAAACCAACCCCGAGCTCGTCGAAGCGAGGCGCCGCGGCATTCCCGTGGTGC1X3CG 

GCCGGCCGTGCTGGCCAAGTTGATGGCCGGGCGCACCACATTGATGGTCACCGGCACGCACG 

GCAAGACAACGACGACGTCCATGCTGATCGTCGCCCTGCAGCACTGCGGGCTTGACCCGTCCT 

TTGCGGTCGGCGGTGAGCTGGGGGAGGCCGGTACCAACGCCCATCACGGCAGTGGCGACTGT 

TTCGTCGCCGAAGCCGACGAAAGCGATGGCTCGCTGTTGCAGTACACACCCCACGTCGCGGTG 

ATCACCAACATCGAGTCCGATCACCTGGACTTCTACGGCAGCGTCGAGGCGTATGTTGCGGTGT 

TCG^^TCCTTCGTGGAGCGCATTGTCCCCGGGGGTGCGCTGGTGGTGTGCACTGACGACCCCG 

GAGGGGCCGCGCTGGCTCAGCGCGCGACTGAGCTGGGAATTCGAGTGCTGCGATACGGGTCG 

GTGCCGGGTGAGACCATGGCAGCCACGTTGGTCTCGTGGCAGCAACAGGGGGTCGGCGCGGT 

CGCACATATCCGGTTGGCCTCAGAACTAGCCACAGCACAGGGTCCCCGCGTGATGCGGCTGTC 

GGTO^CCGGGCGACACATGGCGCTCAACGCGCTGGGAGCGCTGCTGGCCGCGGTGCA^^TCG 

GCGCCCCGGCCGACGAGGTGCTCGACGGGCTGGCCGGCTTCGAAGGAGTGCGGCGACGATTC 

GAACTGGTTGGGACCTGCGGCGTCGGAAAGGCGTCGGTGCC^TGTTCGATGACTACGCCCAC 

CACCCGACGGAGATCAGCGCGACACTGGCGGCGGCGCGCATGGTGCTCGAACAGGGCGACGG 

TG^^CGCTGCATGGTTGTGTTTCAACCCCATTTGTATTCGCGGACAAAGGCATTCGCTGCTGAG 

TTTGGGCGTGCGCTGAATGCCGCTGACGAGGTGTTCGTACTCGACGTCTACGGAGCTCGTGAA 

CAACCGCTGGCCGGTGTCAGCGGAGCCAGCGTCGCTGAGCACGTCACTGTGCCGATGCGCTA 

^cccgIatttttcggcggtcgcaca^ 

CGTCACGATGGGTGCCGGAGACGTGACCTTGCTGGGCCCGGAAATCCTGACCGCCCTTCGGGT 
CCGGGCCAACCGAAGCGCCCCCGGCCGTCCGGGGGTGCTGGGATGA 


>Rv2153c murG TB.seq 2412120:2413349 MW:41829
>emb|AL123456|MTBH37RV:c2413349-2412117, murG SEQ ID NO:77

gtXggacagggtcagccagccggcoggcgggcgcgggggaacggcgccccggcocgccg 

ATGCCGCCTCGCCGTCTTGTGGTTCCTCGCCGTCTGCTGATTCCGTGTCGGTCGTTCTCGCCGG 

cggcgggaccgccgggcacgtcgagcccgccatggccgtcgccgacgccttggtcgcgttgg 

ATCCGCGCGT(XGGATTACCGCGTTGGGCAC^CCGTGGACTAGAGACCAGGCTGGTGCCCC 

agcgcggctaccacctggagctgatcacggcggtgccgatgccgcgcaagcccggcggcgac 

CTGGCCCGGCTGCOGTCGCGGGTGTGGCGCGCCGTCCGGGAGGCCCGGGACGTGCTCGACG 
ATGTCGACGCCGACGTCGTCGTCGGTTTCGGTGGGTAOGTCGCGCTACCGGCTTACCTAGCCG 
CTCGCGGCCTGCCTTTGCCGCCCCGGCGCCGGCGCCGGATOCCGGTGGTGATCCACGAAGCC 


AACGCCAGGGCGGGACTGGCCAACCGGGTCGGCGCCCATACCGCGGACCGGGTGCTCTCCGC 
^^GCCGGMTCCGGGCTGCGGCGCGCCGAGGTGGTTGGGGTCCCGGTCCGTGCGTCGATCG 
CCGCGCTGGAC^GCGGGGTGCTGCGAGCCGAGGCGCGGGCACACTTCGGCTTCCCCGACGAC 
GCGCGG^^CTGCTGGTGTTCGGGGGTTCGCAGGGCGCGGTCTCGCTCAACCGGGCGGTGTC 

^SgccS^tggccgccgccggtgt^gggtgctgcatgcccatggaccccaga 
5 S t ^^g^^gtcgg<^tcaaggtgagccaocgtacgtggcggtgcccta^gg 

ACCG^lTGGAGCTGGCCTACGCCGCCGCCGATCTGGTGATCTGCCGGGCCGGGGCGATGACG 

gt^cg^gwccgccgtcggtctgccggccatctacgtgcxgctgccgatcggc 
g^cagcggctgaatgcgttgccggtagtcaatgccggcggcggcatggtggtcgccgacgc 

r^CCTGACCCCCGAGTTGGTGGCCCGCCAGGTTGCCGGGCTGCTCACCGACCCCGCGCGGC 
10 TGGC^CG^GACCGCGGCCGCAGCCAGGGTGGGACATCGCGATGCCGTOGGCCAGGTGGC 
CCGGGCCGCGCTGGCCGTCGCCACCGGGGCCGGTGCCAGGACAACGACGTGA 

>Rv2154c ftsW TB.seq 2413349:2414920 MW:56306

>emb|AL123456|MTBH37RV:c2414920-2413346, ftsW SEQ ID NO:78
" ^GCT^uA^OM^G^TOCGTCGGGGCACCAGCGACACCGACGGCTCCCAGACTCGAGGGGC 

CGAGCCGGTCGAGGGGCAGCGGACGGGCCCG< 
CCCCGCACCCGTTTCGGTGCCTGGCTGGGCCG 



CGAGCCGGTCGAGG TCCGATGACCTCGTTTCACCTCATCATCGCC 
^ornrATTGCTGACCACCCTTGGACTGATCATGGTGCTGTCGGCATCGGCGGTGCGGTCC 

- ======== 

====== 



TG 



GCGTTCGCCATCTGGGGAGCGCATCT 



GCTGGCCGCCCGGCGCATGGAACGGGCTTCACTG 






CGCGAGATGCTGATTCCACTGGTGCCGGCCGCCG 



TCGTTGCGCTGGCGCTGATCGTGGCCCAG 



CCCGACCTCGGACAGACCGTGTCGATG 



GGCATCATCTTGTTGGGCCTGCTGTGGTATGCGGGG 
rCGTCGTCTCGGCCGCCATCCTGGCG 



CTGCCGCTGCGCGTCTTCCTCAGCTCACTGGCGGCGGT 
GTGTCCGCGGGCTACCGAT< 

GACTCCGGCTACCAGGCCO— — - _™^ AA r^r.TTnATTTTCG 



GTCCGCGGGCTACCGATCCGACCGGGTGCGGTCGTGGCTCAACCCCGAAAACGATCCGGAA 
OGTOTGGGCCAAGGCGTGGOCAAGTGGAACTACTTGCCOAACGCCCACAACGACTTCAT 



^^^^'^^T^CGCTGGCrCAAGGTGOCATTTTCOGCGAC 



CCATCATCGGCGAAGAGCTG rAGCCGGTCCGCCGACCCGTTCCTGCGGCTGCTGACCG 

VGGCGTTCATCAACATCGGCTATGTGATCGGGCTGC 



TCGCCTACACCGGCATGCGCATCGCT 

TCTCCGCCGGTGGAACCTCCACGGCCGCAACAC 



CCACCACGACACTGTGGGTGCTGGGACAC 



TGCCCGTCACCGGCCTGCAGCTGCCGCTCAT 

GAAGCGCGCCAACCCGCAACCGGCCCAAACGCA 



^^^ATAGGCATCATCGCCAACGCGGCTCGCCACGAACCGGAGGCGGTGGCCGCGCTG 
CG^G^G^ 




GCCCCCTCGTCTCGAGGCGTTTCGTGACCG 1 



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CGGGTGA 

>Rv2155c murD TB.seq 2414935:2416392 MW:49314

>emb|AL123456|MTBH37RV:c2416392-2414932, murD SEQ ID NO:79
^^C^T^^GGOTGCGCCCGTG^^TAGCCGOTOGCCOGGTGACC^ 



AGATAACCGGGTATGCGCTGGTGGTCGCCAGTCO 
CCG 

cgcagcg^ , , 7—7^ 

CCACCACGACGTCGATGCTGCACGCCATGCTGA^^^^^^^^^^^^^^^^^^^^ 






GGCAATATCGGCAGTGCGGTGCTGGATGTGCT 






CTGGGCGTGCGCGACGCCCACCTGGTCGATCGO 



GTCGCGTCGATACCGGTGCCAGGTi 



CCGGTCGGCGTGCTTGACGCCCTGGCCGCGGCGGCGCT 
JCGCCGACGCGGTCACGTCGTTTCGAGTGG 



VGGTGGTGGCCGTTGCCGACGGCATCACCTACGTGGACGACTCCAAG 






CGGTGGCCTGCTCAAGGGCGCGTCGC 



GGCCCGCTCGGTCGGGGTGCCCGCCGGTGCGAT 

GGTGCTGATCGGCCGGGA 
CCCGATGTCCCAGTCGTTCAGGTTGTGGCAGGCGAC 

TTGCTTGTGTTCTAGATGT< 

CCGTGATGACCGCTGCGGTGGCCGC - qcGGTTATGCCGACCGGGGCG AGGCATTC 



^3GTTGCCGAGGCGTTATCACGACACGCG 



— ==== 

30 CTGGCACCGGCCGGCGCCTCATTCGACCAGTTCAi 
GCGACCGCGGTCCGCGCGGTGATCCGGTAG 

>Rv2156c murX TB.seq 2416397:2417473 MW:37714 

>emb|AL123456|MTBH37RV:c2417473-2416394, murX SEQ ID NO:80
ATfy^^CAG^TCC^AK^CCGTTGCCGTAGCGGTGACGGTGTCCATCTTGCTGACCCCGGTG 

^t^g^actLcagggcttcggccaccagatccgtgaggatggcccgcccagc 

^CCACACCAAGCGCGGTACGCCGTCGATGGGCGGGGTGGCGATTCTGGCCGGCATCTGGGC 

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^^g^acgt^gaga^ 

v2gtgctgatcaccttctggcagt 



GCGCGGATU , » , ~ . ~- " ~ - ^qcqqtc^ttcacCGATGGCCTGGACG 
TGTTCTGCGTGGT^T^GC^T^^ ^ gct ^ t ^ ctggcagt 

ggctggccgccggcaccatggcgatggtca gtgcgcgacccgctggac 



CTGGCGCTCATCGCGGCCGCAACCGCTGGCGCCTGCATCGG 



CTGGGTCGCTGGCGTTGGGCGGCGTCATCGCGGGGTTG 



ACCGCAACGCGTGCGTGACGGCGCCGGGCCTG^^^^^^^^^^^^^^^^^^ 

10 CCCGCCAAGATCTTCAT^^ 

TCGGTGAOCA^^G^^TG^G^^^^ TGm ^ AT ^ 

A=CTCGGTGGTGTTG^TCCTGACCTT^ ^cc^ggtcatCATCCGGTTCTGOC 

TCGGTGCCTGA

>Rv2157c murF TB.seq 2417473:2419002 MW:51634

>emb|AL123456|MTBH37RV:c2419002-2417470, murF SEQ ID NO:81
ATGA^rcGAG^TGA^CGTCGCGCAGATCGCCGAGATCGTCGGGGGCGCAGTGGCCGATATCTCC 



.CCGGGACCGTCGAGTTCGACTCGCGCGCCATCGG 



TCGGCGGTAGCCGCG<^CGCCGCCGTCGTGCTGGOTOO^TOCCG^TGGGG^^G^CG^C^ 



CCGCAAGACGCCGCGCACCGCCGCGTCA 
CCCGGGCGGGCTGTTCCTCGC 
rCGGCGGTAGCCGCGGGCGC 
rCGTGGTTCCGCCAGTGGCCC 
GGGTCGGGGGCGGCGGTGC1 
TGGTGGCCGGCGGGCTCACC 



20 <X^— SCCCTGCCGGGGGCGCGCGCCGACGGCCACGACCATGCCGCG 

CCGCCGTCGTGC 
TCGTGGTTCCGCCAGTGGCCGCGCCGAACGT, 



GGAGATGGCGGCACGCCATCACGGCAACA 



JCGCCGCGCTCGCCGAGATCGCGCCCCCGTCGA 



TCGGAGTCGTGCTCAACGTCGGCACCGCACA 



TTTGGGTGAGTTCGGCTCCCGCGAGGTCATCG 



TTCCGGAGCGGTCGTCCTCAACGCTG 



30 ^^^^^^^^^ 

ATGACCCCGCGGTGtoU lTGTC GCTGGACGAATTGGCCAGGCCGGGCT 



GGACAACACCGGTGACGTTTGGGCGGGGCCGG1 



W3GTCCGACTCGGGGTCTGCGGCGACCACCAG 

^C^G^^G^GG^GACC^CGGCGC^CCGGTGTCGCGGCATCGGATGCAGGTGACCACCCGC 
" GGCG^G^GGTGAC^GTGATCGACGACGC^TACAACGCCAACCCCGACTCCATGCGGGCCGG 
GCTGCAG^Q^GCTGGCCTGGATCGCGCACCAACCCGAGGCCACCCGCCGCAGCTGGGCGGTGC 

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GATGCATTGGTCGCAGACGACACATGCGGGAGTGTGCGCCCATGA 

>Rv2158c murE TB.seq 2419002:2420606 MW:55310 

>emb|AL123456|MTBH37RV:c2420606-2418999, murE SEQ ID NO:82



10 GTGTCA 



1S OACC^CCCCOCCOOGOTCOCC^^O^^^^^^ 



CGCOGTCGTGGGCGTTCGGTTGGCCGCACTGGCCGATCA 



GG 



GATTTCCACCCCAGCATGGCCGACTA 



CTTCGAGGCCAAGGCGTCATTGTTCGATCCGGACTCGG 



\TCGACGACGACGCCGGGCGCGCGATGGCGGC 
CCGCCGACCGGCCCGCACACTGGCGCGCCACG 



CACTGCGCGCCCGCACCGCCGTGGTGTGCAT 



GA" 



XTCTGGACACCGTCGGGGTCTCCCCGGAACA 



■ ====== 



TCGCAGACCGGCGGGACGCGATCCGGCACGCGGTTGCCTGGGCGCGCCCCGGCGACGT 
GGTGCTCATCGCCGGCAAAGGCCACGAGACCGGGCAACGCGGCGGCGGGCGGGTCCGCCCQ 






AGA 

^GCTGGCTGCCGCGCTAGAGGCCCTCGAGCGGCGCGCATGA 



TTCGACGACCGGGTGGAC 
>Rv2159c - TB.seq 2420632:2421663 MW:36377 

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«o^ c ciMTnw^7RV c2421 663-2420629. Rv2159c SEQ ID NO:83 

GGACGAGGGACTGCTCACCGCCGGC ^^ CGTax ^^ TQCGCr<3 cCcCl<iS 
tgcGTCGACGCACACACCACCATGCTGTACGCGGCAGGCCAAACCGACACCGCCGCGGCGAT 
C^GGC^GCACAGCACCTGCCGCCGGTGACCCGAACGCGCCGTATGTGGCGTGGGCGGCAG 



TGCCGCGTGGCCGCAAGGAAGCCGTCGCC 



GAACCGGGACACCGGCGGGACCGCCGGCACCGTTCGGCCCGGATGTCGCCGCCGAATACCTG 
rtG^C^GCG^'^CAATTCCACTTCATCGCACGCCTGGTCCTGGTGCTGCTGGACGAAACCTTC 

^^Sg^gcgcccaacagctcatgcgccgcgccggtggactggtgttcgcccg 
10 ^SSg^^^tcggccgggccgctccacccgcoggctcgagccgggaacgctg 

q^CWC^TC^SGCA^GGCAACACCGTCCGAGCCCATAGCAACCGCGTTCGCCGCGCTCAGC 
CAC 



.CCACCTGGACACCGCGC^ACCTCK^A^CTCGTCAGGTGGTCAGGC^GGTCGT 
QQ^^TCGT^3^y^GCGAGCCAATGCCX3ATGAGCAGTCGCTGGACGAACGAGCACACCGCCG 



AGCTGCCCGCCGACCTGCACGC^CCCACCCGTCTTGCCCTGCTGACCGGCCTGGC^CCGCAT^ 
OAGO.^— ~~ CCGCGCGGCGCATCGGCACCTGGATCGGCGCCG 



15 ^^^^,C0^CCCO.CCCr^C^^rOCO,COCT S O, 

TGGCGCCCTGGCCTGGGCCGCCTTCACCG 
CCGCCGAGGGCCAGGTGTCGCGGCAAAACCCGACTGGGTGA 



>Rv2163c pbpB TB.seq 2425049:2427085 MW:72506
>emb|AL123456|MTBH37RV:c2427085-2425046, pbpB SEQ ID NO:84

G^^^^OC^^^CGGGAGCCCAGGAGGTTGGGCAACGCAAACG^^CCGG^AAAA^QCAGAA^G 
rnCGGCAAGCCCAGGAAGCCACGAAATCCCGCCCTGCGACACGGTOAGACGTCGCACCCGCG 

^gItcgact^tgcgaggcgcaccoggcaggtggtggacgtcgggacgggcggt^ 

Q^CG^CTn^G^CATCGGACCGGAAACGCGGTCATCTTGGTGTTGATGTTGGTCGCGGCAA^A 
GTO G ^TIir° r-iGGTATCACATGCCGCGGGCCTGCGTGCGCAGGCGGCCGGCCAACTC 
AAGGT^^^^^^^A^CCA^^GG^TCGCGGCAGCATCGTCGACCGCAACAATGACCGGCTC 
^^MyQ^^cATCGAGGCGCGTGCCCTGACGTTCCAGCCGAAGCGGATTCGGCGGCAATTGGAA 
« ftAGGCC^G^AAG^AGACGTCGGCTGCACCCGACCCGCAGCAGCGCCTGCGCGATATCGCCCA 
GGAGGTCGCC^GCAAGCTGAACAACAAGCCAGATGCCGCGGCCGTGCTGAAGAAGCTGCAAA 
rr^^AGACC^TCG^CTACTTGGCGCGTGCGGTCGACCCGGCTGTCGCCAGCGCQATCTGCG 
^r^^TGCGGAAAGACAGGATCTGCGTCAGTAOCCGGGTGGGTCGCTG 
q^GG^^^C^TC^CG^TGGCATCGACTGGGATGGTCATGGGCTGCTGGGTCTGGAGGACTCC 
„ CTGGATOCGGTG^TGGCCGGAACCGACGGATCGGTCACCTACGACCGTGGGTCAGAGGGCGT 

cgtca^cIg!^ 

CCTCG^CA^GA^ATC^AGTTCTACGTGCAGCAGCAGGTGCAGCAGGCCAAGAACCTATCGGG 




r.rTCACAACGTCTCGGCCGTCOTCCTGGACGCCAAGACCGGCGAGGiTGCTCGCGATGGCCA 

ArrACW^CCTTCGACCCGTCGCAAGACATCGGGCGCCAGGGCGACAAGCAG'TTGGGCAACC 

C^G^GGTOT^T^GCCCTTCGAGCCGGGCTCGGTGAACAAGATCGTCGCCGCGTCCGCGGTC 

AT^a^^ACGGGTTGAGCAGCCCCGACGAGGTGCTACAGGTGCCTGGCTCGATCCAGATGGG 

r^^CTTACCGTOCATGACGCTrGGGAGCACGGCGTGATGCCCTATACCACCACGGGGGTGTT 

rRQ/^^TCCTCC^^TCGGCACGCTGATGCTTTCCCAACGTGTCGGACCGGAACGCTATTAC 

^^TC^C^CGGGTTGGGACAGCGCACCGGCGTGGGCOTGCCCGGTGAGAGCGC 

^^^^^^^Q^AATCGACCAGTGGTCGGGCAGTACGTTCGCTAATCTTCCTATTGGCCAA 

QQ^y^^^TOACT^TOCTGCAGATGACCGGCATGTACCAGGCCATCGCCAACGATGGAGTGC 

^^^r^Q^^^^/VTTATCAAGGCCACCGTCGCACCCGACGGCAGCCGAACCGAAGAACCGC 

^cSrG^^CGTGGTGTCGGCGCAGACCGCCCA^CCGTGCGCCAGATGCTG^T 

SgtggSca^gcgatccgatgg^^^ 
Sgc^ga^ggccggcaagaccggtaccgcgcagcagatc^ 
c^acga^gtLttggatcaccttcgccggaatcgccactgccgacaatccc^^ 



15 AT 



CGGCATCATGTTGGACAACCCGGCG 



CGCAACTCCGACGGCGCGCCTGGGCACTCGGCCGC 



CCCGCTGTTCCACAACATCGCGGGCTGGCTGA" 
TCCCGGGCCTCCTTTGGTCTTGCAGGCCACCT 



kTGCAGCGCGAAAACGTCCCGCTGTCACCCGA 



AG 



>Rv2165c - TB.seq 2428236:2429423 MW:42498 

>emb|AL123456|MTBH37RV:c2429423-2428233, Rv2165c SEQ ID NO:85

GGTCATGTGCCGGTATTGGCGC 
ATCCAGAC 

CGGTTTTT^«— " CCGACTTACCCTGGTGCACACCCGCTAT 



25 ATCCA" 



GACGGCTCGCAGGCGGTCCTTCTCGACGCGACCATCGGCGCGGGCGGGCATGCGGAG 
^^AGG^^GCCGGGTCTGCGCCTGATCGGGCTCGACCGTGACCCAACCGCTCTG 

^^?cSg^catccat<3cagctcgacc<^cc<^g<^cgc CT ac<^ 

ACGG^^GCGCCAn^G^ATGCGGATGGACCCGACGACGCCGTTGACCGCAGCTGACATTGTC 
M AA^CTr/^^^^A^G^GGCACTAGCCGACATCCTGCGTCGCTACGGAGAGGAGCGGTTTG^r 
^^^^^^^Q-^3^^GTATCGTCCGCCGACGCGCAAAAACCCCGTTCACCTCGACCGCCGAA 
^TreCCCTOC^G^^CAGGCGATTCCAGCTCCGGCCCGGCGTGTCGGCGGGCATCCAGCC 

c 

GGGCGCATCGCGGTGCTGGCCTACCAGTCG 



CTGG1 , ^ , ^ • ~ • ~~ — - JCGGTCAACGATGAGCTGGAATCGCTGCGCACGGCC 



AAGCGAACATTCCAGGCGCTGCGCATCGC 



35 srs^^ 



ACTTCCGGTCGAACTTCCCGGCCATGAGCCGO 




GATTCCGTTCGTTAACGCACGGCGCCGAACG 






AGCGAGTGTGGCTGAGATCGAACGCAATCCCCGCAGTACTCCAGTGCGGTTGCGGGCCCTGCA 
ACGAGTCGAGCACCGGGCGCAATCGCAGCAATGGGCAACCGAGAAGGGTGATTCATGA 

>Rv2166c - TB.seq 2429428:2429856 MW:15912
>emb|AL123456|MTBH37RV:c2429856-2429425, Rv2166c SEQ ID NO:86

ATGTTTCTCGGCACCTACACGCCCAAACTCGACGACAAGGGGCGGCTGACGCTGCCGGCCAAG 

TTTCGCGACGCGTTGGCAGGGGGGTTGATGGTCACCAAGAGCCAAGATCACAGCCTGGCCGTT 
TACCCGCGGGCGGCGTTCGAGCAGCTGGCGCGCCGGGCCAGCAAGGCGCCACGAAGCAACC 
CCGAGGCGAGAGCGTTCCTACGTAATCTCGCCGCCGGTACCGACGAACAGCATCCCGACAGTC 
AAGGCCGGATCACCTTGTCGGCCGACCACCGCCGCTACGCAAGCCTTTCCAAGGACTGTGTGG
TGATCGGCGCGGTCGACTATCTCGAGATCTGGGATGCGCAAGCCTGGCAGAACTACCAACAAAT 
CCATGAAGAGAACTTCTCCGCGGCCAGCGATGAAGCACTCGGTGACATCTTCTGA 

>Rv2197c - TB.seq 2461505:2462146 MW:22481 
>emb|AL123456|MTBH37RV:c2462146-2461502, Rv2197c SEQ ID NO:87

ATGGTGAGCAGATATTCCGCATACCGGCGTGGGCCGGATGTAATCTCGCCGGACGTCATCGAT 

CGCATCCTGGTTGGGGCATGTGCCGCGGTGTGGCTGGTGTTCACCGGCGTGTCGGTGGCCGC 
CGCTGTCGCCCTGATGGACCTGGGTAGGGGCTTCCACGAGATGGCCGGAAACCCGCACACCAC 
GTGGGTGCTGTACGCCGTAATTGTGGTCTCCGCACTGGTCATCGTGGGCGCGATACCGGTGCT 
GTTGCGAGCTCGCCGCATGGCTGAGGCCGAGCCCGCGACGAGGCCGACGGGTGCATCCGTGC 
GGGGCGGGCGATCGATCGGATCCGGGCATCCGGCGAAACGCGCTGTGGCCGAGTCGGCACCC 
GTACAGCACGCGGATGCATTCGAGGTGGCCGCCGAGTGGTCCAGTGAGGCGGTGGACCGGAT 
CTGGTTGCGCGGGACAGTCGTGTTGACCAGTGCGATTGGCATTGCGTTGATTGCCGTGGCGGC 
GGCGACCTACCTCATGGCGGTCGGTCACGACGGGCCATCTTGGATCAGCTACGGGTTGGCCGG 
GGTGGTCACCGCGGGCATGCCGGTGATCGAGTGGCTATACGCTCGGCAGCTGCGCCGGGTGG

TGGCGCCCCAGTCCAGTTAG 

>Rv2198c - TB.seq 2462149:2463045 MW:30955 

>emb|AL123456|MTBH37RV:c2463045-2462146, mmpS3 SEQ ID NO:88
ATGAGCGGGCCGAATCCCCCGGGACGGGAACCTGACGAACCCGAATCGGAACCCGTCAGCGA
CACGGGCGACGAACGGGCTTCCGGCAACCACTTGCCGCCCGTCGCCGGGGGCGGCGACAAAC 
TGCCCAGTGACCAGACGGGCGAGACCGACGCATATTCTCGGGCATACTCTGCCCCGGAATCCG 
AGCACGTCACCGGCGGCCCGTATGTGCCAGCCGATCTCAGGCTCTATGACTACGACGACTATG 
AGGAGTCGTCCGACCTGGACGACGAACTGGCCGCTCCGCGCTGGCCGTGGGTGGTCGGTGTC 
GCCGCCATAATTGCCGCCGTTGCGCTCGTGGTTTCGGTGTCGTTGCTCGTCACGCGACCACATA
CCAGCAAACTCGCCACCGGCGACACTACGTCCTCTGCACCGCCCGTGCAGGACGAAATCACGA
CCACCAAGCCGGCGCCGCCACCGCCGCCACCAGCCCCACCGCCCACCACCGAGATCCCGACA 


.^ArACAGACGGTCACTGTGACGCCGCCACCACCGCCCCCACCGGCGACAACCAC 

ATGGACCGCAAACGAGCTGCTGA 

>Rv2199c - TB.seq 2463234:2463650 MW:14866 

>emb|AL123456|MTBH37RV:c2463650-2463231, Rv2199c SEQ ID NO:89

===== 

GATTGGTCTTCGAATATTACGTCGGTCCTGAGAAGCACTGA 

>Rv2200c ctaC TB.seq 2463661:2464749 MW:40449 

>emb|AL123456|MTBH37RV:c2464749-2463658, ctaC SEQ ID NO:90






GTGACACCTCGCGGGCCAGGTCGTTTGCAA 



GGCGCTCGCAGCAATGCTGGGGGCATTGGCCGT 
GAAGCCCTGGGCATCGGTTGGCCGGAGGGCATTACCC 



GGAGGGCCTGCCCGTGGTCTTCGACAGCTC 

C ACCGTCAGTGG ATGC AGCTGGTCG _ 5atcggggcggtgatcgcctccctGG cGGTTGGG 



- ===== 
•============ 

TTCCAAC 
ACACCG 
TTCCGG 

ACACGCATTCTGGGTGC™ C CAAGACCGGAGCATTCGTGGGCCACTGCG 



TTCCAAGCCAGAGGGCAAGGACAAGTA 



CGGCGAAGAGCTGGTCGGGCCGGTGCGCGGGCTCA 



.CAAGGTCGAGACGTTGGGCACCAGCACCGAAA 



30 ^«"c^ 

1 OCGOAOTTCTTGTTCAAGCGTGACGTGATGCCTAACCOGGTOGCAAAC 



a^gJc^^^^cgcatcgacgggaagacamcgcc^ggccctgcgggcga 
" ^^^S-tgaccacccacccgtttgatactcgccgcggtgaattggccc 

cgcagcccgtaggttag 
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HM« ■»> — ™~ "SI""" 

GTGAAGTGCTGCGTGGCTATACCCTGCCC^CGGeu ^ CQG( & TGA 



TGGACAT 

» -ACCGGGGACAGAGAA^^^^ 



VTGCGGCCGCGGTGATGGTGAACGCAT 
3CGCCGAGATCGGCATCTCCACCCAGA 
GGAATTGACGTCGACCAAGTGGATCGCATGGGGAC 




>emb 



.RV2438C to YHN4_YEAST P3879S TB.^ 27*793:2737006 ^30492 



ATGGGACTGCTCGGCGGCCAATCAGGGCCCAGGGT y 
^^^^^^^OCTCOACCTOOTOACCGAATCCGCCOACCTO 



ACCCGQ^— CGCTGTCGGGCTACTCCATCGAGGACGTACTACTGCAG 
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^.^^Trr^AGGACATGTTTGTGCCGATGCCGCCCAGCGCCGAGGCGGCCCTGG 
**^^^*^!^^^QCreGC^^^CTGTCCGGCAGCCCGATCACCATCGGCCGTGCCGW3GAC 

^^^^^^qqqqaGTCAACGACGGACCTGGCCTGGGACGGTCAGACGATGATCTGGGAGAATG 
^rG^C^^^rc^TOG^VGTCCGAACGTTTCCCCAAAGGAGTGCGCCGCAGTGTCGCCGACGTTG 
^^^^^q^J^CT^^^^CGGAGCGGCTGCGGATGGGCAGGTTCGACGACAACCGGCGT^iC 
CAC^G^AACGGAATCGTTCCGGOGCATCGACTTCGCACTCGACOCACCGGCAG^GAG 

^ac^^g^gcgaggtcgagcggttcccgttcgttccggccgatccgcaacgattgoaa 

CAGG^TTCCTACG^G^CTACAACATCt^GGTGTCTGGACTCGAGCAACGGTTGCGGGCGCTG 
GACT^TCCGAAG^TCGTTATCGGTGTGTCCGGGGGATTGGACTCGACGCACGCGCTGATCGTC 
^G^^^^^GCATGGACCGCGAGGGCX^GCCGCGCAGCGACATTCTGGCGTTTGC<^TGCC 
rGGATT^GCCACCGGGGAGCACACTAAGAACAACGCGATCAAGCTGGCACGTGCGCTGGGGG 
^^CTTCT^G^^^^ATATCGGCGACACCGCTCGGTTGATGCTGCACACAATCUSGCCATCC 
O^^C^Gn^GCGAAW^GTGTACGACGTCACCTTCGAGAACGTCCAGGCCGGGTTGCGO^C 
rr A^W^CTTTKXGTATCGCCAACCAGCGCGGGGGAATCGTAGTGGGCACCGGGGACCTGTC 

SagI^^g^tcgacatacggtgtcggcgaocagatgtcgca^ 
^™^cca!gacgctgatcca^acctgatccggtgggtcatttcggcgggtgagtt 

cggtgagaIg^^^ 

^^Saggaggagctgcagagcagcgaggccaaggtcggagctttcgccctacagg 

I^^^^TACTGGGCTACGGATTTCGCCCGTCGAAGATTGCGr^TGGCCTGG 
^^^^y^O^CGMG^GGAGCGGGGCAACTGGCCGCCCGGCTTCCCAAAGAGCGAACGCCC 

G^OTC«GOGTTCGGCATTGCCCAACGGCCCCAAGGTGTCCCACGGGGGCGCGTTGTCGC 
qq^CTG^GG^^GCGGGCCCCGTCGGATATGTCAGCGCGAATCTGGCTCGATCAGATCGACC 

GTGAGGTGCCCAAGGGCTAG 

>Rv2439c proB glutamate 5-kinase TB.seq 2737118:2738245 MW:38789
>emb|AL123456|MTBH37RV:c2738245-2737115, proB SEQ ID NO:93

^^Igaj^tw^^^^^gacgcaatccggaccgcgcgcggccttgtcgtgaaggtcgggag 

^^^GC^Cn^^^^GGTCCGGGATGTTCGATGCCGGCCGGCTGGCCGGACTGGCCGAQG 

cgI^gIg^ggUaag^ 

S^gaIcggctcgggctgtcccgtcgtcccaaagatctggcgacoaagcaggcg^ 
gScag^tcgggcaggtcgcgctggtgaactcgtggagcgcggcgttcgcccgctacg^ 

gcac^ggccaggtgctggt^^ 

^aaggcacgctggatcggctgcgogcgttgcacgcggtggcgattgtcaacgagaacga 

^^ctggccaccaacgagatccggttcggtgacaacgatcggctgtctgcactggtggcgca 

^^^^cgctttggtgctgctgtcggacatcgac^octctacgactgcgaccc 
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SS===S5===========S 

^TTCT^^TTATQCCGCCGAA^AACCGGCGCACTGACTCTCGACGCCGGTGCGGTG 

^^CGGCGATGTGGTOGAACTGCGTGCACCCGAOGCGGCCATGGTAGCCCGCGGGG 
TGG^^CCTACG^CGCGTCCGAGCTGGCCACCATGGTGGGCCGGTCCACCTCTGAGCTACCCG 
G^GAGCTGCG^CGCCCGGTGGTGCACGCCGACGATCTGGTCGCGGTGTCGGCGAAGCAAGCT 

AAGCAAGTTTAG 



>Rv2440c obg Obg GTP-binding protein TB.seq 2738248:2739684 MW:50430 

>emb|AL123456|MTBH37RV:c2739684-2738245, obg SEQ ID NO:94
TCCACACCAGAGCGGGTTCGGGCGGTAACGGCTGC

kGCCGCTGGGCGGCCCCGATGGCGGAAATGGCGGCCG 



GTGCCTCGGTTTGTCGATCGGGTCGTCAT 

CCGCA^i^^GGTOGCTTCGGGCAAGCACGGGATGGGCAATAACCGCGAGGGGGCCOCCGG 



GATTTGGAAGTGAAAGTTCCCGAAGGCACCGTGGTATTGGACGAGAACGGCCGGCTAGT 
GGCX^ACC^GT(^GCGCGGGCACCCGCTrrGAAGCCGCCGCCGGAGGC^TG^CGGT^GG 
GCAACGCCGCGCTGGCTTCCCGCGTGCGTAAGGCCCCCGGTTTCGCACTCCTCGGCGAAAAGG 

GACAGTCOCGAGACCTOACC^GGAA GA1TTCGGCQGCCAAGCCGAAGATC GGCGACT 



CGCGC 

20 ^^^^Tc^cGAGACCTCACCTTGGAACTCAAGACCGTCGCCGACGTCGGCCTGGTCGGGTTTC 
jqq^^q^^qT^^^^SG^TTGATCCCGGGCGCATCCCGGGGCCGTGGTCTGGGGCTGOACTTT 

^^Sgacgcggctctgggcgatctcgccgcacggccgcgtgcggtggtcctcaaca 

A^^^^n^TOC^GGAGGC^^GCGAGCTCGCGGAGTTCGTCCGTGACGACATCGCCCAGCGC 

^gJc^tg^cTgcgtgtg^ 

- CTG^CG^AG^MTCSAT^TCGGACTACAACGCTGCGCGGCCGGTGGCGGTGCCACGGCGGCCGGT 
30 CTGTCGCAGATBAIV-io rACCGT CGAACCCGACGGGCATGGTGGCTr 



CTGGTGCATGTGGTGGATTGCGCTACCGCCGAGCCG 



^CA^ 
CGTCGGC^TCTCGCCGACCGGCTGGCGCGCCTGGGTGTCGAGGAGGAATTGCTGAGGCTC 

CGGC JGGCGAGATGACGTTCGATTGGGAGCCGCAAACG 



GTGCGCGGTCAGGATGCGCGGTGACCATCC 



IGGGGCACCGATCCGCGGCTGGACAGCAACAA 



35 QQQ^^^^^^^^^^^^W^^AA^G^C^^CGGAGTCGGCGTCGCGAACACGQGGATGGC 
TGA 



>Rv2441c rpmA 50S ribosomal protein L27 TB.seq 2739773:2740030 MW:8969
>emb|AL123456|MTBH37RV:c2740030-2739770, rpmA SEQ ID NO:95

ATGGCACACAAGAAGGGGGCTTCCAGCTCGCGCAACGGTCGCGATTCCGCCGCCCAGCGGCT 
GGGGGTTAAGCGGTACGGCGGCCAGGTCGTCAAGGCCGGCGAGATCCTGGTCCGCCAGCGCG 
GTACCAAATTCCATCCCGGCGTCAACGTCGGGCGTGGCGGCGATGACACCTTGTTCGCCAAGA 
CGGCCGGGGCGGTCGAGTTCGGCATCAAACGCGGACGTAAGACGGTGAGCATCGTCGGTTCG 

ACCACTGCCTGA 

>Rv2442c rplU 50S ribosomal protein L21 TB.seq 2740048:2740359 MW:11152
>emb|AL123456|MTBH37RV:c2740359-2740045, rplU SEQ ID NO:96 

ATGATGGCGACCTACGCAATCGTCAAGACCGGCGGCAAGCAGTACAAAGTCGCTGTCGGAGAT 
GTGGTCAAGGTCGAAAAGCTGGAATCCGAGCAGGGGGAGAAGGTGTCCCTGCCGGTGGCTCT 
GGTTGTCGACGGCGCCACCGTCACCACCGATGCGAAGGCACTGGCCAAGGTCGCGGTGACCG 
GTGAGGTGCTCGGGCACACCAAGGGCCCCAAGATCCGTATCCACAAGTTCAAGAACAAGACTG 
GCTACCACAAACGGCAGGGACACCGTCAGCAGCTGACGGTCCTGAAGGTCACCGGCATCGCAT 

AA 

>Rv2448c valS valyl-tRNA synthase TB.seq 2747596:2750223 MW:97822 
>emb|AL123456|MTBH37RV:c2750223-2747593, valS SEQ ID NO:97

ATGCTGCCCAAGTCGTGGGATCCGGCCGCGATGGAGAGCGCCATCTATCAGAAGTGGCTGGAC 

GCTGGCTACTTCACCGCGGACCCGACCAGCACCAAGCCGGCCTATTCGATCGTGCTGCCGCCG 
CCGAACGTGACCGGCAGCCTGCACATGGGCCACGCGCTGGAACACACCATGATGGACGCCTTG 
ACGCGGCGCAAGCGGATGCAGGGCTATGAGGTGCTCTGGCAGCCGGGCACCGACCATGCCGG 
GATCGCCACCCAGAGCGTGGTCGAGCAGCAGCTGGCGGTCGACGGCAAGACTAAAGAAGACCT
CGGCCGCGAGCTGTTCGTGGACAAGGTGTGGGATTGGAAGCGAGAGTCTGGCGGTGCCATCG 
GCGGCCAGATGCGCCGACTCGGTGACGGGGTGGACTGGAGCCGCGACCGGTTCACCATGGAC 
GAAGGTCTGTCGCGGGCGGTGCGCACGATCTTCAAGCGGCTTTATGACGCCGGGCTGATCTAT 
CGGGCCGAGCGGCTGGTCAACTGGTCGCCGGTGCTGCAGACCGCGATCTCCGACCTCGAGGT 
CAACTACCGCGACGTCGAAGGCGAGCTGGTGTCGTTTAGGTACGGCTCGCTTGACGACTCGCA
ACCCCACATCGTGGTCGCCACCACCCGGGTCGAGACGATGCTGGGCGATACCGCGATCGCCGT 
CCATCCCGATGACGAGCGCTACCGTCACCTGGTCGGCACCAGCCTGGCGCACCCATTCGTCGA 
CCGGGAGCTGGCCATTGTCGCCGACGAGCACGTGGACCCTGAATTCGGCACCGGCGCGGTCA 
AAGTCACACCCGCCCACGACCCCAACGACTTCGAAATCGGGGTGCGCCACCAGCTGCCGATGC 
CCTCGATCCTGGACACCAAGGGCCGGATCGTCGACACCGGAACGCGATTCGACGGCATGGACC
GCTTCGAGGCACGGGTCGCGGTGCGCCAAGCGCTCGCGGCCCAGGGCCGCGTGGTCGAAGAA 
AAGCGACCCTACCTGCACAGCGTCGGACACTCCGAACGCAGCGGCGAGCCGATCGAGCCGCG 

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^^TCCCTGCAGTGGTGGGTCCGGGTGGAATCGCTGGCCAAAGCGGCCGGGGATGCGGTGC 
^rlj^^^GG^AC^GTGATTCACCCGGCCAGCATGGAACCCCGCTGGTTCTCCTGGGTCGACG 
^^q^^^^CTOGK^ATCTCGCGACAGCTCTGGTGGGGGCATCGGATCCCGATCTGGTACG 
^^A^G^CAGOTGTGCGTCGGCCCGGACGAAACACCCCCGCAGGGCTGGGAACAG 

^^a^^atacctggttttcgtcggcgctgtggccgttttccacgctgggttggc 

CG^^WGACGG^G^^GCTGGAAAAGTTCTATCCGACAAGCGTTCTGGTTACCGGCTATGACAT 

CTTGTTCTTTTGGGTGGCCAGAATGATGATGTTCGG 
» ^«^^/-rs/-nr2r5f5RCCCC3CAGGTGCCG1 

kGGGCAACGTCATCGACCCGCTGGATTGG 



rGGGTGGCCAGAATGATGATGTTCGGCACCTTCGTCGGCGACGACGCCGCCATC 
ACCCTCGACGGCCGCCGGGGCCCGCAGGTGCCGTTCACCGACGTGTTTCTGCATGGGCTGATC 

TGACTTGGCGGTGAG^^ 
GACCGACGCCGACCGCTGGATTCTCGGAAGGTTGG 



.CTTGGCGGTGAGCGAGGATGCCGTGCGGGCGTCGCGCAATTTCGGGACCAAGCTGTTCAA 

r -rCACTCGGTACGCACTGCTCAATGGCGCC 

CGCCAC rCGGTAC JATTCTCGGAAGGTTGGAAGAGGTTCGGGCCGAAGTTGATTCGGC 

CTTCGACGGATACGAGTTCAGCCGCGCTTGTGAGTCCCTGTATCACTTCGCCTGGGACGAATTC 



TGCGACTGGTACCTCGAACTGGCCAAAACGCAGCTTG 



CCCAGGGACTCACACACACCACCGCC 



GCTGCACCCGGTGATTCCCTTCCTCACC 



rTRCTGGCCGCCGGGCTGGACACGCTGCTGCGCCTC 

gIS^tg^tcmcgctgaccg^^gaatcgotggtcagcgccgactggccggagcc 

GAGGCGCTATGGCT^ VCGGATTAACGATATGCAGAAGTTGGTGACC 



GCACG^^^TGCGGGACTCGGATCTGAGCAACCAGGTGGCCGCCGTGA^CT^G^^^^GTGG^ 
^.^AnrrQGGCCCGGATTTTGAGCCGTCGGTCTCGTrGGAGGTrCGGCTCGGCCCCGAGA 
TGAACCGCACCG^GT^^^GAGCTCGACACCTCGGGCACCATCGACGTGGCCGCCGAGCGT 

SStoga^gttggccggcgcccaaaaggagctggcgtcgac^^ 

GGCCA^^CGGACTTTCTGGCCAAAGCGCCCGACGCCGTCATTGCCAAGATCCGGGACCGCCA 
GCGCGTGGCGCAGCAGGAAACCGAGCGCATCACCACCCGGTTGGCTGCGCTGCAATGA 

>Rv2482c plsB2 TB.seq 2786915:2789281 MW:88284
>emb|AL123456|MTBH37RV:c2789281

GTGA^AMCCGGCGGCCGATGCCAGCGCGGTGCTTACTGCCGAGGACACACTGGTGCTGGC 
^G^^GTCGAGATGGAGOTGATCATGGGCTGGCTGGGCCAGCAGCGTGCAC 
^C^ACT^GTTCGACATATTGAAGCTGCCACCGCGCAACGCTCCGCC^CGGO^ 
TGACGG^ACTGGTCGAGCAGCTCGAGCCCGGCTTCGCATCCAGCCCGCAATCTGGCGAGGAC 

GTG^T™^ 

rTGCGTACCGATCCCAGGCGCGCGCGGGTGGTGGCCGGCGAGTCGGCCAAGGTGTCCGAACT 
G^A^t^^ 

CCGCCGAGCGCTGTTGGCGCTGGOGCGCGCCGAATATCGGATCCTTGGACOGCAATACAAATC 
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TCCCCGGCTGGTGAAGCCGGAGATGTTGGCGTCCGCACGATTTCGTGCCGGCCTGGACCGGAT 
TCCG^^G^AC^^CGAAGATGCCGGGAAGATGCrrCGACGAACTCTCCACCGGATGGAG^ 

IgctctSgtagacctggtttccgtcctcggcaggctggctagccgcggcttcgatccggaat 
^a^Lagtatcaggtcgcggcgatgcgcgccggactggagoctcatccggcggtc 
Sgg^Scaccggtcctacatcgacggcgtggtggtaccggtggccatgoaggac^ 
^a^cVggtgcacatgttcggcggcatcaacctgtcgttcggtctcatgggacccctc 
^gcgg^ct^ggggatgatcttcatccggcgcaatatcggcaacgacccactgtataagtacg 

g^gTcgcgcaccggaaac^^ 

atgcttacctggacggccgcagtgacgacatcctgctgcagggggtttcgatttgcttcgatca 

3CC 

tgcgctggctctacaacttcatcaaggcgcagggggaacgcaacttcggcaagatctacgttog 
^cg^gc^totcgatgcgccagtacctcggcgcaccgcacggcgagctgacccagg 
^^^acggcttgcgttgcagaagatgtcgttcgaggtggcctggaggattttgc 

, S A^rcSS^GACCGCGACGGGTTTGGTGTCCGCACTGCTGCTCACCACCCGCGGCACC 
G^G^GACGCn^^CAGCTGCACCACACGTTGCAGGACTCACTGGACTATCTGGAACGCAAA 
GCGTTGACGCTCGAl* rGCGCTCGCGCGAAGGCGTCCGTGCGGCGGC 



10 ^c^agTt^aatacgccgcctacgcc^tggcgcggagaagacgcccgaaggtt 



OAATCGCCGGTTTCGACAAGCGCATTGCGACTC 



SGGTCGACAGTGGCCGGGAGCCGGTATGGT 



aGAGAGCTCGATGGTCGAGCTCGCGCTGGCCCATGCCAAGCACGCCGAAGGTGACCGCGTCG 
^GCGTTCTGGG^ic^3GCGATGCGGTTGCGGGATCTGCTGAAGTTCGACTTCTATTTCGCGG 




CCGCGTTCTGGGCCCAGGCGATGCGGTTGCC ...-at- 

a^a^ggLtttcgggccaacatcgcccaagagatggcctggcacoaagactgggaggatc 
a^g^gTgggcaat^^ 



ACGCGATGTTGCGGGTCTTCTTCGAAGCCTATGA 



GATCGTTGCCGACGTGTTGCGCGATGCTCC 



GCCTGACATCGGTCCTGAGGAGTTGACGGAGCTGGCGCTCGGCCTCGGCCGTCAGTTTGTGGC 

3GCCGGGTCCGCAG 

CGTCGATCAGGAGC 
CGGCGGGAGTTACGAAACATTCTGCGGGATTTCGAC 



25 ^ • ' — ' ~~ ' - ~ rATCGACGCTGCTGTTCGCCACTGCACGCCAGG 

-rmrCGTCGATCAGGAGCTGATAGCGCCGG 

TCGCCGTCGATUAoo vCTATGTCGAGCAGATCGCGCGCAACCAG 



acagggccgggtccgcagcagcgaaccggt; 

Ig^cgatcaggagctgatagcgccggcggccgacgtcgocgaacgtagggtcgccttc 



AAAGCGCGfCAAGGACGCGACCGAATCTAA 



TTCGTCGCCTGCGAGTTC 

>Rv2509 - putative oxidoreductase TB.seq 2824676:2825479 MW:28014
>emb|AL123456|MTBH37RV:2824676-2825482, Rv2509 SEQ ID NO:99

atgccgatacccgcgccca.sccccgacgcacgtgcogttgtcaccggggcttcgcagaa^tc 
ggc^^^cgctggccaccgaactggccgcacgcgggcaccacctgatcgtca^^g^ac^ac^ 
« cgaggacgtgttgaccgagttggctgcccggctggccgacaagtaccgcgtcacggtcgacg 

35 ^^^™<^™**^-<^^^^^ 

cggcccatctcgatcctgtgcgccaacgcgggtaccgcgacattcggcccgatcgcatcgctc 
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GATC 



TTGCCGGCGAAAAGACGCAGGTGCAGTTGAATG 



CCGTGGCGGTGCACGACCTTACGTTG 



GCGGTGTTGCCGGGCATGATCGA 



GCGCAAGGCCGGCGGCATCTTGATTTCTGGTTCGGCGGCC 



GGCAATTCACCGATTCCCTACAACGCCACCTA 



TGCCGCGACCAAGGCCTTCGTGAACACCTTCA 



GCGAATCTCTGCGCGGTGAGCTACG 1 



CGGCTCCGGCGTGCACGTCACGGTGCTGGCCCCGGGC 



GAAGCGTCACTGGTCGAGAAGCTGGTGCCGGAC 



GCCATCGTGGCGCCAATCGTGGGTGCCTTTTACAAGAGGCTTGGGGGCAGCTAG 

iCTGGTGAGCCCTACGCTGTCGCATTCGGTGGCCAGGG 
rGGTGTCGGCCACCGGGATAGAAACCGAGTTGGC 
VTCCGGTCACCGACGAGCTGATTGTGGTGCG 
GGCGCACTGGCGGCCGAGGACCCGGTTCCGT 
GTGCCCGGCGTGTTGCTTACCCAGATCGCGG 



>emt 



CCACGCTCTGGTCGATCGCCTCATGGi 



GTGACGATCCACGAGCACGACCGGGTGTCCGCTGATCGCGGCG^GGACAGCCCGCATACCAC^ 

ACGCTCTGG 1 
CAGCGCCTGGCTGGAAACCCTCGAAGAGCTC 

GACGTTGGTCGGTGAGGCAGAGCTGTTGCTCGAT

CCGACAA^uw,^ - "TCGTGGCCACCCCGCCGGTCGCCATGGCG 



CCCGATCGGTTTCGAGCCGCTGCAATGGGTA 

:aagcacctgacgtcggccgccgtgtcg 

CCCGGGCGCTGGCCCGTCAAGGCATGGACC 



CGA< 



OCt^^T^^^^^^GAGTTCGCCCAGGACGTGCGCACGGTGCTGCCACCGGTGTTGTCCATCC 

^^ccgI^cgagccggtgcaggtggaggtgggctttcacaccccgcggctatcc 

" GAC^^An^^ATCG^^S^GGCTGGGCCGAGAAGGCGGGCCTCGATGTCGCCTTGGCTCG 
GGAGCTGGCCGATGCCATCTTGATCAGAAAGGTCGACTGGGTCGACGAGATCACCCGTGTCCA 

CACCGGTGATCCGCGGO GGTTGCCCGGGCCTGGTCGAGCTACGCACCGACC 



GCCGGGCGACATCCTGACCCGACTGACCG 

ATCG' 

XTO^^^CCTGCTOGMGGCATGACCCCGACCACCGTGGACGCCAAGATCGTCGCCGCG^ 
rr^^C^CACTGGGCCGAGCTGGCCGGCGGCGGGCAGGTCACCGAAGAGATCTTC 
GGTAACCGCA^^^V^^AATGGCCGGCCTGCTCGAGCCGGGCCGCACCTATCAGTTCAACGCG 

^St^cc^tgtggaagc^caggtgggcggcaagcggttggtgcagaaggcc 
cgccact^^ggcgccgcgatcgacggcgtggtgatcagcgccggcatcccagacctcga^ga 
ggccgtc^^ctgatcgacgaactgggcgacatcggcatcagccacgtcgtgttcaaacccgg 
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..rrATCGAGCAGATCCGCTCGGTGATTCGCATCGCCACCGAGGTGCCCACCAAGCCGGTGAT 

CATGCACGTCGAGGGCGGGCGCGCCGGCGGGCACCATTCCTGGGAGGATCTCGACGACCTCC 

TrtCTGGCTACCTACTCGGAGTTGCGCTCACGCGCCAACATCACGGTGTGCGTCGGCGGCGGCA 

TTCGCA^CCCGAGAAGGGCTGCGGAATATTTGTCCGGGCGCTGGGCGCAGGCCTACGGCTTCC 

™^^^GGCATCCTGGTCGGCACCGCGGCGATGGCCACCAAGGAATCCACCA 

^C^^^^TGCTCGTCGACACTCAGGGCACCGACCAATGGATCAGCGCCG 

qaaaaGCG^GGGCGGCATGGCCTCCAGCCGCAGTCAGCTCGGTGCCGATATCCACGAGATC 

tcgcggagtotcgcgacgagatcatcgcggcgatggccaagaccgccaagccctacttcggc 
g^^gtcgccgacatgacctacctgcagtggctgcggcgctacgtcgaactggccatcggggaa 

ccgcttcga^caga^gctocagcgtgccgaagcccggttgcacccacaggatttcggcccg>^ 
cSga^a^caccgatgctggcctgctggacaatccgcagcaggcgatcgccgccctg^ 

. S^ct^cgacgccgagaccgtgcagttgcat^^ 
^gtgI^gctgggcaagccggtcaacttcgtgccggtgatcg^^ 

rnTGCTGG^G^W3CGACTCGCTGTGGCAGGCCCACGACGCCCGCTACGACGCCGATGCGGTG 
TGCATCAnX^^GGW^CGCGTCGGTAGCCGGCATCACCCGGATGGATGAACCCGTCGGTGAG 



qq^qCGTTTCGAGCAAGCCGCAATCGATGAAGTGCTCGGCGCCGGTGTCGAGCCGAAG 
"CGCGTC 

CGCACCCGATG , — iAAAAC CCGCGCGCCACACACTCATCCACCGGC 



TTGCTG^oo, . ■ — -TGGCCGGACCGTTGGCTGTCGTCCTCGA 

rGTGCGCTGGGCCGGTCGCACCGTGACCAACCCGGTGCATCGGATCGCCGACC 



GATGTC 



GCGTCGCGCCGGCTGGGCCGCGCCGACGTC 



qqACA^SgA^TCACGTTGCCGGCCAACACCGTCGATGGCGGCACCCCGGTGATCGCCACCGA 
GGACGU^~— .ctttgacqGTGGACTGGCACCCCGAGCGTGTTG 



.cgocaccagcgccatgcgcacggtgctggcgatcgcggccggtgtcgacagcccggagt 
^^^^c^^^^g^^^a^^^a^^^tcggtgagccgctggcacccagtotcaccaacgtg^ 

CCGACCACACCb^ fCGCGGCCATCGGATCGGCGGTCACC 



CCCGACGCGCTCGTCGGCCCTTGTTGGCCAGCGG 
GACA< 

rGCCGGTCTCGGTCGTCGTTACCGGCGCCGATGGCG 



CCGGTGAGCCGGTGGTGGAAGGCCJGCTGAGCCTGGTGCATCTGGACCACGCCGCCCG 
CGTGGTCGGTCAGCTGCCCACGGTCCCGGCCCAATTGACCGTCACCGCAACGGCTGCCAACGC 



AACCGATACGGACATGGGCCGCGTCGTC 
CCGTGATCGCCACTCTCGAGGAGCGATTCGCGATO 



CTGGGTCGCACCGGTTCCGCCGAGCTCG 
^orrcGGCGCGAGCCGGTGGCGCGGTGTCGGCGAACGCCACCGACACCCCGCGCCGTCG 
^O^^^ACCGCGCCGGTCGAOATGCGCCCGTTOGCGGTGGTGTC^GCG 
Ar^^^^^^^CACACCGACCGGGCCGCCGCGCTGCTTGCCGGCCTGGAGTCGCCGATC 
GTGCACGGCAT^i^^CTGTCGGCCGCGGCGCAACACGCGGTGACCGCCACCGACGGGCAGG 
^G^^GCOCGGCTGGTCGGCTGGACCGCGCGGTTmGGGCATGGTG<^^GC 
GACGAGGTG^ACTTCCGCGTCGAGCGCGTCGGAATCGACCAGGGCGCAGAGATTGTGGACGT 

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rGCCGCGCGCGTCGGGTCGGATCTAGTGATGTCGGCCTCCGCGCGACTGGCCGCACCCAAGA 
^TCT^A^CGGCCAGG^ATCCAACACAAGGG^TGGGCATGGAGGTGCGCGCC 
MCTCCAAGGCGGCCCGCAAGGTGTGGGACACCGCGGACAAGTTCACCCGCGACACCCTGGG 
C^CTCGGTACTGCACGTGGTCCGCGACAACCCGACCAGCATCATCGCCAGCGGTGTGCACTA 
5 CCACC^CCCGACGGGGTGCTCTACCTGACGCAGTrCACCGAGGTCGCGATGGCGACGGTGG 

Sg^^aggtcgccgagatgcgtgaacagggagccttcgtcgaaggcgccatcgcgtgc 

SccStc^gcgagtacaccgcgctggcctgcgtgaccgggatctaccaactggaagc 

^to^4gtgtttcaccgcgggtcgaagatgcac^catcgttccgcgcgacgagct 

r^«^AACTATCGGCTGGCGGCCATCCGGCCGTCCCAGATCGACCTCGACGACGCCG 
,„ ^G^G^CGTCGCCGGGATCGCGGAGAGCACCGGTGAATTCCTGGAGATCGTGAATT 
TCAA^^^S^GCTCGCAATACGCGATCGCGGGCACGGTACGCGGCCTCGAGGCGCTCGAG 
^C^A^G^^AGCGGCGCCGCGAGCTCACCGGCGGCCGACGGTCGTTCATTTTGGTGCCCGG 
^^T^GTTCCACTCGCGAGTGCTGCGGGTCGGGGTGGCCGAATTCCGGCGCTCGCT 


^/-./^-rr^rs AOfS Art ATCCTCGCCGACTACGA 

GCAATTCGCCAGCCCGGTGCGCTGGATCGA 



rrTGCCGCGGTTGTTCACCCTGGACCGCGACTTCATCCAGGAAATCCGGGATTTGGTGCCCGC 



GGCGCGCACGGTGTTCATCGAGCTGCTGGCATG 

AGAT^SG^GTGAA^GCT^^GAC^GTGGCGGGTCTTGCCACCAACACCCTCAAACTGCCCG 



GA< 



.CGCAGGATCTGCTGTTCATCGAGGAGGCCGCCGGCGGGCTGGGTGTGGAGCGATTCGTCG 



AATACGCCCACAGCACAGTGGAAGTGCTCAACGCCGAGCGTGATGCCGCGGTGCTGTTCGCCA 
CCGACA^^^C^GAGCCGGAGCCGGAGGAAGACGAGCCGGTCGCGGAATCGCCCGCGCC 




GGACGTCGTCTCGGAAGCCGCCCCCGTCGCGCCGGCCGCTTCGTCGGCGGGCCCGCGTCCCG 
ACGA^TCTGGri^TCGACGCCGCCXSATGCCACGCTGGCGCTGATCGCGCTCTCGGCCAAGATGC 
ACGA I CTGGTTT rCGACTCCATCGAGTCCATCACCGACGGTGCGTCGTCGCGGC 



GCAACCAGCTGCTGGTGGACCTGGGCTCCGAGCTGAACCTCGGTGCCATTGACGGCGCCGCC 
GA^CGG^CTGGCCGGTCTGCGCTCACAGGTGACCAAACTGGCGCGCACCTACAAGCCTTAC 



GCATCGACCAGATCGAAGAACTi 

CGGCCCGGC^CCATCGCCGAGCGGGTGAAGAAGACCTGGGAGCTCGGTGAGGGCTGGGCCA 
^TGT^TCGAGGTCGCGCTGGGCACCCGCGAGGGCAGCAGCGTTCGCGGCGGCGCC 
M A^GG^G^CA^CT^CACGAGGGCGCGCTGGCCGATGCCGCCTCCGTCGACAAGGTCATCGACGC 
GG^GTCG^^TCGGTGGCCGCGCGCCAGGGCGTTTCGGTAGCGCTGCCGTCGGCCGGTAGTG 

gScggScIa^^ 

^GG^CGTCCTGGCCTCCGCGGCCCGCCTGGTGCTGGGGCAGCTGGGACTGGACGACCOCGT 
CM^GCCTTGCCGGCCGCCCCCGATTCCGAGCTGATCGACTTGGTCACCGCCGAACTGGGAGC 
GGACTGGCCGCGGTTGGTGGCACCGGTGTTCGACCCCAAGAAGGCCGTCGTATTCGACGACC
GCTGGGCCAGCGCCCGCGAGGACCTGGTGAAGCTGTGGCTGACCGACGAGGGCGACATCGAC 
GCCGACT^CCGCGCCTGGCGGAGCGCTTCGAGGGTGCCGGCCACGTCGTGGCGACCCAGG 

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^ T nrrAA GG TAAGTCGCTGGCCGCGGGCCGGCAGATCCATGCATCGCTGTACGGCC 
ACCGGCGCTTCG AAGGGTTCG ATCGCCGCGTC ^ «qqq jgGCGTTCTACCGCACGCTGTA 



5CCGCGTCGGTGGTGGCTCGGCTGCTCGACGGCGGAGC 



CA( 



GCGTCATCGCGACCACCTCCAAGCTCGACGAGGAGO 



TCGCGACCACGCCCGTTACGGCGCGGCGCTGT 



GGCTGGTCGCGGCGAACATGGCGTCCTACT 
W3ACCGAAAGCCTTGGGCCGCAGT 



GCTGTTCCCGTTCGCGGCGCCACGCGTGG 



CCGACGTCGACGCCCTGGTCGAATGGATCGGCACCGAACAC 
;ATCAAAGACGCGCAGACCCCGACGC1 
CCTGTCGGAGGCCGGTTCGCGCGCCC 
r-^x »-r/^^«/~r2ranrrTGTCGACGATC 

5CATGTTCGGCGGCGACGGCGCCTACGGCG 
10 GCACGTGG1 <*. . — ■ — - ■ ■-- — jqgcTGGCACGCCGAGTCGTCCTGGGCGGCA 



TCGGGGACCTGTCGGAGGCCGGTTCGCGCC _ ^QAACGCGACATCGCGTCGCGGCT 



JGCCGAGATGGAGATGAAAGTGCTGCTGTGGGCC 

GTGCAACGGCTGATCGGCGGCCTGTCGACGATCGGCGCCC 
^CGTGGTGCTGCCCGGCTCGCCCAACCGTGGC 

^ ^!°^^CGTCGAAGAQGCCGGOGTCACCACCTACTCQACCGACQAGATGQ 



TCGO 



CGCGCTGCCGTCGCCGCCCCGGGGTTTCA 



CTCGATGTCGACCCGGCCGACCTGGTGGTGA 



JCGTCGGCGGCGCCGAAATCGGCCCGTACGG 



AGCGCGTCGGCATTCGCGAATTCGTTGATGACGGC 

GCGCCTTCGTCGAGTTCGATCCCQAGCACACGGTCATCCGGCCGGTGCCCGACTCCACCGACT 



rGGTGTCGGTGTTCCTGGAGAAGGACTTCGCGTTCGTGGTGTCCTCGGAGGCCGATGCGC 



TGCTC 

^ » ^-rrr/; ATr.r.r.GAGCACAC 

TCCGGGTGCCGCGAAAGACCAAGCTGTCCCGC 



25 GGCAGGTCATCCGCAAGGCCGGCACCGAGA" 



GTCGTCGGCGGCCAGATCCCGACCGGGTTCGA 



.CCCGACGGTGTGGGGCATCAGCGCAGACAT 



\TGGTGGCGACCGTCGACGCGTTCCTGTC 

3CGGTTCCA l O^jmoovjvj ■ ■ ^ — 

Af-irvmATfSCGTTACC 

GTC 



GGCCGGTTCCATCGACCGGTTGGCGGTATGGAACA1 
~-^-^^-r a /^r*T AnrwrGCGATGATO 

3TTGGGCAAGGCTCAACTGGTGGTGGCCGGCG 



GTGCTACGTCGGTAGCTACGGTGCGATGATCCACCCGGTAGOCGCGTGCGCCACCGGCGCGOT 

^^^^^^^^0^i^Q^SAGQGCM^^TCGGATTCGGTGACATGGCCGCCACCGCCGACA 
^TCCA^G^TG^GCGGCCGCGGCATCCACGACTCGAAGTTTTCCCGGCCCAACGACCG^G^ 
CGTMA ^°™C CGGCGGGACGATCCTGTTGGCCCGCGGGGACCTGGCG 

GCA^CCTCGArcCCGGCCCCGGGCCTGGGCGCGCTGGGGGCGGGCCGCGGCGGCAAGGAT 
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TCACCGCTGGCGCGGGCGCTGGCCAAGCTGGGCGTGGCCGCCGACGACGTGGCGGTCATCTC 
CAAGCACGACACCTCGACGCTGGCCAACGATCCCAACGAGACCGAGTTGCATGAACGGCTCGC 
CGACGCCCTGGGCCGTTCCGAGGGCGCCCCGCTGTTCGTGGTGTCGCAGAAGAGCCTGACCG 
GCCACGCCAAGGGCGGCGCGGCGGTCTTCCAGATGATGGGGCTCTGCCAGATATTGCGGGAT 

GGGGTGATCCCACCCAACCGCAGCCTCGACTGCGTCGACGACGAGCTGGCCGGCTCCGCGCA
TTTCGTGTGGGTGCGTGACACGTTGCGGCTCGGCGGCAAGTTCCCACTCAAGGCCGGCATGCT 
GACCAGCCTCGGGTTCGGCCATGTGTCGGGCCTGGTCGCGTTGGTGCATCCGCAGGCGTTCAT 
CGCCTCGCTGGATCCCGCACAGCGCGCGGACTACCAGCGGCGTGCCGACGCCCGCCTGCTGG 
CCGGTCAGCGCCGGCTGGCCTCGGCGATTGCCGGTGGTGCGCCGATGTACCAGCGGCCCGGT 

GACCGTCGCTTCGACCACCACGCGCCCGAGCGGCCGCAGGAGGCGTCGATGCTGCTGAATCC
GGCGGCCCGGCTGGGTGACGGCGAGGCGTATATCGGCTGA 

>Rv2555c alaS alanyl-tRNA synthase TB.seq 2873772:2876483 MW:97326
>emb|AL123456|MTBH37RV:c2876483-2873769, alaS SEQ ID NO:101
GTGCAGACACACGAGATCAGGAAGCGGTTCCTCGATCATTTCGTGAAGGCGGGCCACACCGAG
GTGCCCAGCGCCTCGGTGATCCTCGACGACCCCAACCTGTTGTTCGTCAACGCCGGGATGGTC 
CAGTTCGTGCCTTTCTTCTTGGGACAGCGCACGCCGCCGTACCCGACGGCCACCAGCATCCAG
AAGTGCATCCGTACCCCCGATATCGACGAGGTGGGCATAACCACCCGGCACAACACGTTTTTTC 
AGATGGCCGGCAATTTCAGCTTCGGCGACTATTTCAAACGCGGGGCCATTGAACTGGCCTGGG 
CACTGCTGACCAACAGCCTCGCCGCCGGCGGCTACGGCCTGGACCCGGAAAGAATCTGGACG
ACAGTCTATTTCGACGACGACGAAGCTGTCCGGCTATGGCAGGAGGTTGCCGGGCTGCCGGCG 
GAGCGAATCCAGCGCCGCGGCATGGCCGACAACTACTGGTCGATGGGCATTCCCGGACCGTG 
CGGGCCGTCATCGGAGATCTATTACGACCGCGGACCCGAATTCGGTCCCGCAGGCGGTCCCAT 
CGTCAGCGAAGACCGCTACCTCGAGGTCTGGAACCTGGTGTTCATGCAGAACGAGCGCGGAGA 
GGGAACCACCAAGGAGGACTACCAGATCCTCGGGCCGCTGCCCCGCAAGAACATCGACACCG
GCATGGGCGTCGAGCGGATCGCGCTGGTGCTGCAAGACGTGCACAACGTCTACGAGACCGAC 
CTGCTCAGGCCGGTCATCGATACCGTGGCCAGGGTCGCCGCGCGTGCCTACGACGTCGGCAA 
CCACGAAGACGACGTGCGGTACCGCATCATCGCAGACCACAGCCGCACCGCCGCGATCCTGAT 
CGGTGACGGCGTCAGCCCCGGCAACGACGGTCGCGGTTATGTGCTGCGCCGGCTGCTGCGTC 
GGGTGATCCGCTCCGCCAAGCTGCTGGGCATCGACGCTGCGATCGTTGGCGACCTGATGGCCA
CGGTGCGCAACGCGATGGGCCCGTCATATCCCGAACTCGTCGCCGACTTCGAGCGGATCAGCC 
GGATCGCGGTCGCCGAGGAGACGGCGTTCAACCGCACGCTGGCGTCGGGTTCCAGGCTGTTC 
GAGGAGGTGGCTAGCTCCACCAAGAAATCCGGAGCCACCGTGCTGTCCGGATCGGACGCTTTC 
ACGTTGCATGACACCTACGGGTTCCCGATCGAGCTCACGCTGGAGATGGCGGCCGAAACCGGT 
CTGCAGGTAGACGAAATCGGGTTCCGTGAGCTGATGGCCGAGCAGCGCCGCCGTGCCAAGGC
CGACGCCGCCGCGCGCAAACACGCGCATGCTGACCTGAGCGCCTACCGCGAGCTGGTTGACG 
CCGGCGCCACCGAGTTCACCGGATTCGACGAGTTGCGTTCCCAGGCGCGGATTCTGGGCATCT 
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TCGTCGACGGTAAGCGGGTTCCGGTGGTGGCGCACGGTGTAGCCGGCGGAGCCGGGGAAGG 
GCAG^G^GTCGAACTTGTCTTAGATCGCACCCCGCTCTACGCCGAATCGGGTGGGCAGATCGC 
CGATG^GGGCAC^TCAGCGGAACCGGTTCCAGCGAAGCTGCCCGGGCCGCGGTTAC^GACG 
T^rAGAAGATCGCCAAAACGCTTTGGGTGCACCGAGTCAACGTGGAATCCGGGGAATTCGTCG 

5 AGGCT^^AC^GTA/JCGCGGCGGTGGATCCCGGGTGGCGCCGGGGTGCCACGCAGGGCCA 
C^^^C^TGGATGCCGCGCTGCGACAAGTGCTGGGGCCCAACGCGGTTCAGG 
CG^GATCGCTGAACCGGCCGGGATATTTGCGCTrCGACTTTAACTGGCAGGGTCCGTTGACCG 
^^^A^AGGTCGAAGAGGTCAGCAACGAGGCCGTGCAAGCGGACTTCGAGGTG 
CG^^C^^C^^GAACAGCTCGACAAGGCCAAGGCGATGGGTGCCATCGCGCT^n^^GGCGAG 
^rTArr^CGACGAAGTGCGGGTGGTGGAGATGGGTGGACCGTTCTCGCTGGAGCTATGTGGC 

" GGC^^^^^TGAG^ACACGGCGCAGATCGGTCCCGTGACGATCCTGGGCGAO'TCG^^^TC 
GGCTCCGGGGTGCGCCGGGTGGAGGCCTACGTGGGGTTGGATTCGTTTCGTCACCTGGCCAA 
GGAG^GTGCGTTOATGGCCGGGTTGGCCTCGTCACTGAAGGTGCCGTCCGAAGAGGTACCGG 
C^GGGTGGCCMTCTAGTGGAGCGCCTGCGGGCCGCCGAGAAGGAACTCGAACGTGTCCGG 

„ ATCGCCAGCGCCCGGGCAGCCGCCACCAATGCCGCCGCCGGGGCTCAGCGGATCGGTAACGT 

^™g^cSa^gaatgtccggcgggatgacc<^ggcagacctgcggtcgttgatcg 

t^g^cmgcCgggtagcgagccggcggtggtggc^ 
agccaaactg^g^cgwtgcggtcgcggccaatcccgctgcccaggacctcggaatccgtgcc 



GCGACA1 



CTTGCGGTGGCGGTCGAAGGCCGCGGTGGCGGTAAGGCGGACCT 
GGCGCAGGGCTCGGGAAAGAATCCGACCGGTATCGACGCCGCGCTCGACGCGGTCCGCTCCG 



AACGACCTGGTCAAACAA 




AGATCGCCGTGATAGCGCGGGTCGGTTGA 

>Rv2580c hisS histidyl-tRNA synthase TB.seq 2904822:2906090 MW:45118
>emb|AL123456|MTBH37RV:c2906090-2904819, hisS SEQ ID NO:102

gtgacggw^^ctcgt^^ttttcggcccccaagggggtaccggactacgtcccgcccgactcg 
g^^cagttcgtcgcggtgcgcgacgggctgctcgcggcggcccgtcaagccggctatagcca 
c^^g^gctgcccatcttcgaggacaccgccctgttcgcccggggcgtgggtgaatcca^^^ 
^ggtgtccaaggagatgtatacgttcgccgaccgtggcgaccgctcggtgacgctgcggcc 

cgag^ca^cgggg^^ 

ccggtgaagt^^gttatgcgggcccgtttttccgctacgagcgtccgcaggccggccggtat 
cgSIg^acagcaagtcggggtggaggcgatcggcgtcgacgacccggcgttggacgccga 
ggtgat^^c/^gccgacgccgggttccgctcgttgggtctcgacgggttccggctggaaat 

atcCTLTGGGAGACGAGAGTTGCCGTCCGCAGTACCGGGAACTGTTGCAGGAGTTOTGTTr 

ggactc^tctcgacgaggacacgcgcaggcgcgcagggatcaatccgctgcgggt^ 
^cIagc^ccgaattgcgtgcgatgacggcgtcggcgccggtgttgctggatcatc^ 
^^xg^osccaagcagcatttcgacaccgtgctcgcccatctggacgcgcttggagtgccctat 
gtcatcaacccgcgcatggtgcggggcctggactactacaccaagaccgccttcgagttcgtc 






CCAGCTTGGCGGGCAGGACTTGTCGGGCATCGGGTT 



TCGGGGATCGGCGGCGGGGGGCGCTACGACGGCCTGATGCA 
; G TCGGGCATCGGGTTCGGGCTGGGCGTGGACCGGACCGTGC 
GGGCAAGACGGCGGGGGACAGCGCCCGGTGCGACGTGTTCGGCGT 

s osocmca^^cc^o^ QCQAOCGCGACATC gao<3ocgggacggtcg 

TAATTTCGCGGCTGGCTGGGTAG 
>Rv2614c thrS threonyl-tRNA synthase TB.seq 2941190:2943265 MW:77123



GTA 



.CGCCCGATGCGATCGTCGTCGTG' 



CGCGACGCCGACGGCAACCTGCGCGACCTGAGCTGG 






GTGCCCGACGTCGACACCGATATC, 



ACGCCGGTGGCCGCCAACACCGACGACGGTCGCAGCGT 



JCCGCCATTCGACCGCGCACGTG 



GA 



GCTCGGCA 



iTTGGCCCAAGCCGTCCAAGAGCTGTTTCCGCAGGCCAA 



JCGGACCACCCATCACCGACGGCTTCTACTA 



CGACTTCGACGTGCCCGAGCCGTT 



GAAAAGCGGATGCGCCAGATCGTCAAGGAAGGCCAGC 
GGCCCGCGCCGAGCTGGCCAACGAGCCC 



20 TACAAGCTGGAACTCGTCGACGACAAATCGGGT. 
GAGCTC 

CCGCGGAC^^.^- ^ -jcTGCAACGGATCTACGGCACCGCGTGGG 



^CTGGCGGGGCGATCAGAAAAACGCCAGCC 



CGCCTACT^^o^..^ \GTTCATCGAAGAGGCGCAGCGCCGCGACCAC 



iTCCCAGGAGGCGCTCGACAGGCACCTGGA' 






ATGCACTGCCTGATCTTCCGCGCGCGCGGG 



CGATCCTATCGGGAACTGCCGTTGCGGCTCTTC 



GAGTTCGGCACGGTGTATCGCTACGAGAAGT 



CCGGTGTGGTGCACGGGTTGACCCGGGTGCGT 



GGGCTGACCATGGACGACGCGCACATCTTCTGC 



ACCCGCGACCAGATGCGCGACGAGCTGCG 



GTCGCTGCTGCGGTTTGTGCTCGACCTGCTCGCCGA 



GTACGGCCTCACCGACTTCTACCTCGAA 
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GAACCCGCCACCGC(XGGTGATGATCCACCGCGCGCTATTTGGGTCGATCGA6CGGTTCTTCG 
G^^Cn^^CGAGCACTACGCGGGGGCGTTCCCGGCCTGGTTGGCGCCCGTGCAGGTGGTC 
GGCATCCCGGTCGCCGATGAGCACGTCGCCTATCTGGAAGAGGTTGCCAGGCAACTGAAGTCG 
CA^GG^^GCGGGCCGAGGTGGACGCCAGCGACGATCGGATGGCCAAGAAGATCGTGCACCA 
CACCA^XACAAGGTGCCGTTCATGGTGTTGGCGGGTGATCGTGACGTCGCCGCCGGCGCGGT 

I^Sggttcggtgaccgcaccca^ 

^GTCGCCTGGATCGCTGACCGCGAAAATGCGGTTCCTACAGCGGAACTGGTGAAAGTGGC 
CGGTCGTGAGTGA 

>Rv2697c dut deoxyuridine triphosphatase TB.seq 3013683:3014144 MW:15772
>emb|AL123456|MTBH37RV:c3014144-3013680, dut SEQ ID NO:104
Q^GTCGACCACTCTGGCGATCGTCCGCCTCGACCCCGGGCTCCCGCTGCCCAGCCGCGCTCAC 

GACG^CG^^^XGGCGTTGATCTCTACAGCGCCGAAGACGTCGAGCTGGCACCTGGGCGCCG 
CGCCCTGGTACGGACGGGTGTTGCGGTCGCCGTCCCGTTCGGCATGGTCGGGCTGGTCCATC 

cgcgctSg^ggccaogcgggtggggctttcgatcgtcaacagtccgggca^^ 
Sggg^g?tggggagatcaaggtggccctgatcaacttggacccagccgcgcccatcg^ 

gt^cgcg^^^ 

gIg^cgtcgttcgacgaggcogggotggcctcgaoatoccgcggcgacggt^ 
ttcctccggcggacatgcgagtttgtga 

>Rv2782c pepR protease/peptidase, M16 family (insulinase) TB.seq 3089045:3090358 MW:47074
>emb|AL123456|MTBH37RV:c3090358-3089042. pepR SEQ ID NO:105

^GCCGCG^CGGrcACCAGCTGACCCCGCGGCGGCGCTGGCGCCGCGGCGCACCACCCTGC 
CG^^^GCGAGTGGTCACCGAATTCCTGCCCGCGGTGCACTCCGCGTCGGTCGGGGTG 
TGGG^^^^^T^SGATCGCGCGACGAAGGCGCCACGGTGGCCGGGGCGGCGCACTTCCTTCA 
GCAT^GCTGTTCAAGTCGACGCCCACCCGCTCTGCCGTGGACATTGCGCAGGCGATGG^ 

ggt"^cggggaactgaacgcattca^^ 

CGGCAGCGACTTGCCGTTGGCXJGTCGACCTGGTCGCCGATGTGGTGCTCAACGGCCGCTGTGC 

c^cgatgtcgaggtggaacgtgacgtcgtcctcgaggagatcgcgatgcgcgacgacg 
ao ac^cogaggacgccttggcggacatgttcctggcggcgttgttcggcgaccacccggtcggtc 
gSc^tg^gcagcgcgcaatccgtgtcggtgatgacgcgggctcaactgcaatc^^ 
a^^ctataccccggagcggatggtcgtcgcggccgcc^caatgt^ 

gLctggttgcgttggtccgcgagcacttcgggtcccgg^ 
tgcgccgcgcaagggtaccggccgggtcaacggcagcccccggttgacactgg^agccgcg 

» acgccgaacagacgcatgtgtcgctgggcatccgcacacccgggcgcggctgggagca^^t 

tg^cactgtcggtgctgcacaccgcgctgggcggtggcttgagttcccggctgtt^ag^ 

gtccgcgagacccgogggctggcctactcggtctactccgcgctggatctottcgccgacagc 

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PCT/US00/31152 

WO 01/35317 

GGCGCGCTTTCGGTGTACGCGGCCTGCCTGCCCGAACGCTTCGCCGACGTGATGCGGGTGAC 
GGCGCGC 5CGACGGCATCACCGAGGCGGAATGCGGCATCGCCA 



CGCCGATGTGCTGGAAAGCGTGGCACG 
AGG' 

GGTCAACGCGGTGGCCCGCCACCTGCTGAGCAGGCGCTACGG 



-GATCGCTGCGGGGTGGGCTGGTGCTAGGGCTGGAGGATTCCAGCTCCCGGATGAGCCGG 
CTCG^CCGCAGCGAGTTGAACTACGGCAAGCACCGCAGCATCGAACACACCTTGCGGCAAATC 



GGGTAG 

>Rv2783c gpsI pppGpp synthase and polyribonucleotide phosphorylase TB.seq
3090339:3092594 MW:79736 >emb|AL123456|MTBH37RV:c3092594-3090336. gpsI

ATGTCT^CG^TGAAATTGACGAAGGCGTGTTCGAGACGACCGCCACCATCGACAACGGGAGC 

^Tt^CACCCGGACCATCCGCTTCGAGACCGGCCGATTGGCCTTGCAGGCCGCCGGCGCGGT 

QQTCG^CT^^CTCGACGACGACAACATGCTGCTGTCGGCGACCACCGCCAGCAAGAACCCCAA 

^^C^^i^C&^C^CTTCXCCC7CACGGTCGACGTCGAGGAGCGCATGTATGCGGCCGGCCG 

CATCCCCG^^CGTrCTTCCGTCGCGAGGGCCGACCCTCCACCGACGCGATCCTGACCTGCCG 

g^catcg^gc^ 

G^GA^^^TC7CAGCCTGGATCCGGGCGATCTCTACGACGTATTGGCGATCAACGCGGCX3^TC 
GGTGACGATTCIL fCTCCGGGCCCATCGGCGGTGTGCGGGTGGCGC 

OCGTCGACCAGATCGAGCGCGCCGTGTTCGACA 



GGCGTCCACCCAGCTGGGCGGTCTGCO 

TCATCGACGGCACCTGGGTCGGCTTCCCCAC 

TGGTctTllcCGGCCGGATCGTCGAGGGTGATGTTGCCATCATGATGGTCGAAGCCGAGGCCA 

CCGMAACG^OCTCGAGCTCG7CGAAGGTGGTGCCCAAGCGCCGACGGAAAGCGTGGTGGCC 

S^TGGAGGCGGCCAAGCCGTTTATCGCCGCGCTGTGCACCGCGCAGCAGGAGCTTGC 

CGA^^C^GCTOGMAGTCGGGCAAACCGACCGTCGACTTCCCGGTGTTCCCTGACTACGGCGA 

A^CGTOTAC^WTTCGGTGTCCTCGGTGGCCACCGACGAGTTGGCC^CCGCGTTGACCATCGG 

CGGT^GCCGAGCGCGACCAGCGCATCGACGAAATCMGACCCAGGTTGTGCAGCGGCTCGC 

^ACA^^CGAGGGTCGCGAAAAGGAGGTCGGCGCCGCGTTGCGTGCCCTGACCAAAAAGCT 

GGTOGGC^CGCATCCTCACCGACCATTTCCGTA7CGACGGCCGC 1 3GCATCACCGACATTCG 

CGCA^GTCGG^CGAGGTGGCCGTGGTTCCGCGCGCGCACGGCAGCGCGC7GTTCGAACGCG 

S^CCCAGATCCTGGGTGTGACCACACTCGACATGATCAAGATGGCCCAGCAGATCGAOT 

CGTTGGG^C^GGAGACATCGAAGCGGTACATGCACCACTACAACTTCCCGCCGTTCVrC^ACCG 

gggIgaccggtcgggtcggttcgcccaagcggcgtgagatcgggoagggcgoactggccga 
gSgSggtgccggtgttgccgagcgtggaggaattcccgtatgccattcggcaggtgtg 

GGAGGCTCTGGGCTCCAACGGGTCGACCTCGATi 
TGCTCAACGCCGGGGTGCCGCTCAAGGCGCCGG1 

g!cga^gaagtagaaggggcggtcgacggcgttgtggagcgtcgcttcgtcaccct^ 

q^ATCCTCGGCGCCGAAGACGCGTTCGGTGACATGGACTTCAAGGTCGCCGGGACCAAGGAC 




GGGGTCGGTGTGCGCGTCGACGCTGGCGC 
TGGCCGGCATCGCGATGGGCCTGGTCTCC 






TTCGTCACCGCGCTGCAGCTGGACACCAAGCTCGACGGGATCCCTTCGCAGGTGCTTGCCGGA 

GCACTCGAGCAGGCCAAGGACGCCCGCCTCACGATCTTGGAGGTGATGGCTGAGGCCATCGAT 

AGACCCGACGAAATGAGTCCCTACGCCCCGCGGGTGACCACCATCAAGGTTCCGGTGGACAAG 

ATCGGGGAGGTCATCGGACCCAAGGGCAAGGTCATCAACGCCATCACCGAGGAGACCGGCGC 

GCAGATCTCCATCGAAGACGACGGCACCGTGTTCGTCGGCGCCACCGACGGGCCATCGGCACA 

GGCCGCGATCGACAAGATCAACGCCATCGCCAACCCGCAGCTGCCGACGGTGGGCGAACGGT 

TCCTCGGAACCGTGGTCAAGACCACCX3ATTTCX3GTGCCTTTGTATCGTTGCTGCCTGGCCGCGA 

CGGTCTGGTGCACATTTCCAAACTCGGCAAGGGCAAGCGCATCGCGAAGGTCGAGGACGTTGT 

CAATGTCGGTGACAAGCTGCGGGTGGAGATCGCCGACATCGACAAACGGGGCAAGATCTCCCT 

GATCCTGGTCGCCGACGAGGACAGCACCGCCGCCGCTACCGATGCCGCGACGGTCACCAGCT 

GA 






>Rv2793c truB tRNA pseudouridine 55 synthase TB.seq 3102364:3103257 MW:31821 
>emb|AL123456|MTBH37RV:c3103257-3102361, truB SEQ ID NO:107 

ATGAGCGCAACCGGCCCCGGAATCGTGGTTATCGACAAGCCCGCGGGAATGACCAGCCATGAC 
GTGGTGGGGCGGTGCCGCCGCATCTTCGCCACCCGGCGGGTCGGCCACGCGGGCACCCTGG 
ACCCGATGGCCACCGGGGTGTTGGTGATCGGCATCGAACGCGCCACCAAGATCCTCGGTCTGC 
TGACGGCGGCCCCCAAGTCGTATGCCGCCACCATCCGCTTGGGTCAGACCACTTCCACCGAGG 
ACGCCGAAGGTCAAGTGCTGCAGTCGGTTCCGGCTAAGCACCTGACCATCGAGGCGATCGACG 
CCGCGATGGAGCGGCTGCGCGGTGAGATCCGGCAGGTGCCGTCGTCGGTCAGCGCGATCAAG
GTCGGTGGCCGACGCGCCTATCGGTTGGCCCGCCAGGGGCGCTCCGTGCAATTGGAAGCCCG 
GCCGATCCGCATCGACCGGTTCGAGCTGCTGGCCGCACGCCGGCGCGACCAGCTCATCGATAT 
CGATGTGGAGATCGACTGCTCCTCGGGAACCTACATCCGCGCGTTGGCACGCGACCTCGGCGA 
CGCGCTTGGGGTGGGAGGCCATGTGACGGCGTTGCGGCGCACCCGCGTCGGCCGCTTCGAGC 
TGGACCAGGCGAGATCGCTCGACGATCTCGCGGAGCGCCCCGCGCTGAGCCTGAGCCTCGAT 
GAGGCCTGCCTGCTGATGTTTGCGCGCCGCGACCTGACCGCCGCGGAGGCCAGCGCGGCCGC 
CAACGGCCGGTCCCTGCCGGCGGTCGGTATCGACGGCGTGTACGCGGCCTGTGACGCCGACG 
GCCGGGTTATCGCGCTGCTGCGTGACGAGGGTTCGCGGACCAGGTCGGTGGCGGTGCTCCGC 

CCGGCGACGATGCACCCCGGGTAG 


>Rv2797c - TB.seq 3105619:3107304 MW:58761 >emb|AL123456|MTBH37RV:c3107304-3105616. 
Rv2797c SEQ ID NO:108

GTGCCACTGACCGTGGCCGATATCGATCGGTGGAACGCGCAAGCGGTCCGGGAGGTGTTTCAC 
GCGGCCAGTGCCCGAGCGGAGGTGACGTTCGAGGCGTCGCGTCAGTTGGCCGCGCTGTCGAT 
TTTTGCGAACTCGGGTGGCAAGACCGCTGAGGCGGCGGCACACCACAACGCGGGCATTCGCC
GAGACCTCGACGCCCACGGCAACGAGGCGTTGGCGGTTGCCCGGGCGGCCGACAGGGCCGC 
CGACGGGATTGTGAAGGTTCAGTCCGAGCTGGCCGCACTACGCCATGCCGCCGCGGCCGCCG 

AGCTGACGATCGATGCGCTGATCAACCGGGTGGTGCCGATCCCCGGGCTGCGATCCACCGAG 

GGCCGAGGCCAATGCCGTCGACGAGGAGCTGGCCTCAGCGGTC 
CGCC^TCCCGGCCGATTCCGGCCCG^^ 

GCCAGC^^TGCCAACGAGGAGCGGCTGCGCGAGGAGCGCGCCCGCCTGCAGGCCCACC^CG 
AGCGG^ACAGGCGGAGTATGACCAACTGAGTGTGCGGGCCGCCCGTGACTACCACAACGGCA 

GACCC^C^^^G^^CCCGAGGACCCAAATCAGCAGGTGCTGGCGGCCGTGGCCGTCGGTAA 
TCCCGACAC^ 

CCC^v^— 5CCTGGATGGGCTACCACCCGCCCCCGAACC 

CACTCGACACCGGCAGTGCGGGCGATCTGTGGCAGACCATGACCGATGGGCAGGCACACGCG 

jGGTGCGCGCCAATAACCCCAGTGGCCACCTG 



10 iuuus^^ TGCGGTCGGAGGTAATCCGGCAACTCAATG 



JCTGCCCGGCATGGTGACCGAAGCCCGCGACCTC 
CTGCCGGCAAGCCCGCATCGGTTGCCACCATCGC 



GGCGCGGCCGATCTGTCGCGGTATTTGCAGCAi 
15 ft u^i«n«^ rACGGCTCACCCGGCTTGGAGCTGTACAGC 



ACCGTGTTGGGGCACTCGTATGGGTCGCTi 



GACGGCGTCGCTGGCGTTGCAGGACCTCGATGCC 



CAGAGCGCCCATCCGGTCAACGACGTCGTG1 ^ atgtcat ^ agqccccccacgaoctcatc 



CCGGCGCAGCTCGGGCTCGATCACGGGCACGC 
. . .^^^-r^rtorsnraTTGGCGCCGCTGCAC 

rGATCCGGGCGGGATCTGGCGTGACGGAGT 



ACCAATCTGGTGGCGCCGTTGGCGCCGCTGCACG 
GGGTTCACGGAGCTGTCGTCACAGGCGGG 



GATGGGGCCTGGACCCCTATCTGACCCCC 



GTATGCCCACGGGGACTACCCGCGGTCCTTCCTCGATGCCGCCGGCCAGCCGCAGCTGCG^GA 

:GGCGATCGCC< 

3CCGGCAGCGCCCGGCCCAGCACTGAGAGGGGG 



20 ?S ' — — .TCGCCGCCGGGCTGCCCGACAACACGGTGGGCCCGCCG 



TGTCCGGCTATAACCTGGCGGCGAT 
CTGCTTCCGCCAATTCTGGGTGGCGGCATGC 

ACGTTGA 



>Rv2864c ponA2 TB.seq 3175454:3177262 MW:630,5 >emb|AL123456|MTBH37RV:c3177262-
3175451. Rv2864c SEQ ID NO:109

atxmtaa^aIaacaacattagcctcagccacctcaggtttgctgctgcttgcggtcgtcgccat 

OTCGGGCTBCA^CCCGCGTCCCCAAGGGCCCGGTCCGGCGGCCGAAAAGTTCTTCGCCGCOC 

^Igcca^gtg^accgcctccgccgcccagctcagcgagaaccccaacgaggcgc^ 
» gcgctgaacgcg^^ctgggcggggctgcaggccgcccacctggatgcgcaggttctcagcgc 
^a^ccgLgacaccggtagggtggc^atgggttcaggtgggatctgggc^ 
a^ctggagctatgacggccagctgaagatggcgcgcgacgaagggcgttggcacgttcgctc 
gacca^;^gcgg^t^gcatcccaagctaggcgaacatcaaacgttcgcgctacgagccgaccc 

^Sg^gggtggg™^ 
35 atcw^tactcgctggacgccggccaggccggccgcgagctcttcggcacggcacaggcggtg 

gtg^^cg^cgctgcaccccttcgacgacacgctcaatgatccgcagctgctggccgaacaggcc 
agctcgtcgacccagccgttggacctggtcacgttgcacgcggacgacagcaaccgggtggc 






,CGATCGGGCAGCTGCCTGGCGTGGTGATCACACCGCAGGCCGAGCTGCTCCCGACCG 

kCGATGTCAAG> 
JTCAACCAAAAl 
3TTTCGATCAD 
X3GCAAGGCGA 

CGCCGGGGCCGATGCGGACGGTCCGGTCGCGACCA 



^CCAAAATGGCGTCGACGTCTCGGTGCTGCAC 

AAG GUww w i i oouovjw i ww • ~- 

* a ^rT^rcTnnTCGGl I f( 

GA< 



^^rA^Gcl^^ATCACGTTGGATCGGGTCGTGCAAAAOGOC 
GCGCAACAC^GOT^^C^C^GGGGCGGCAAGGCGATGATCGTCGTGATCAAGCCGTCGAC 



ttcccaactacggtggctttgatctgggc ctg{ ^ ccccgC(3GTCTGACTCAGGCGG cco 



, 0 caacaccacc^cgcccag^ 

GCACCGAGGACGGTTTCGGCCAGGGCAAGGTGCTGGC 
GCGGCGACGGTAGCCGCCGGGAAGACCCCGGTTCCAC 
AGCTGATCGCCGGCCGGCCGACGGCCGTCGAAGGCGATGCCACACCGATCAGCCAGAAGATG 



GGCGGTACGGGATCGGGCTTGACTACCAl 
CCGCCGACGGTGGACCTGGCCGAACGCACCGAGG 
CAGCCCGTTCGGCATGGCCTTGGTGGCGGCGACGC 



15 A 



JCGACGCGCTGCGGCCCATGATGCGGTTGGTGGTGACCAATGGCACCGCCAAGGAGATCGCT 
GGCTGTGGCGAGGTGTTCGGTAAGACCC _ JCGTCGGGGGCGGTAGCTCGGA 



rAAGACCGGCGAAGCCGAATTCCCGGGCGGATCGCATTCCTG 






^^^^ 
3179365, ^^^^^^^^^^^q^^qq^qq^qqqqqcACCCACGCTCGCTCCCCGGCGCGCCAC 

ATCGCCCGGCACAGCCAGATCCCGGTAGTCGCC 
CCGCCATCGACGCTGGATGTGCCGCGGTGCGGGTU 
GCCGGGTGGGTGAGGTCGCCAAGGCGGCGGGTGC< 

CAACGCCGGTTCGCTGGACAAACGGTTCATGGAG^A^^^^^^^^^^^^^^^^^^^^ 



GGACATACATTTCCAGCCGCGCTACATATTCG 
CGGGTCAACCCGGGCAACATCAAGGAGTTTGACG 



GCGGCCGGGATCCCGATCCGAATCGGTGT 

M ^cc^cga^^^^^ggggcctggatggtctcgatgtgccgttgcgggtggc^tgatgg 
ggtgtg^og^caatggtccgggtgaagcacgtgaggccgacctgggcgtggcgtccggcaac 


GACACCGAGCGGTTCGCCTATTGTGACCGTAAGCTGA 

>Rv2869c - TB.seq 3180548:3181759 MW:42835 >emb|AL123456|MTBH37RV:c3181759-3180545.

Rv2869c SEQ ID NO:111
TCGCSATCCTG ArrTCGGTGGCCCTGCACG

AT ^^^^^^^^^^^^q^qqqCC^^CCGGGATGAAGGTACGTCGCTATTTCGTCGGCT 
AATGTGGTCACATGT^GT^^GC T ^ GTGT ^^^ TC ^ 

TTGGCCCCACGTTG^^TCG^O^G^CGC^^^^^^^^^^^^^^^^^^^^^^ 
TGGGCGGCTTCTGTGACATCGCCGGCAT ^ a ^ OTA „ COCCGGGCCCGGA A 



ACCGTGCGATGTACAAGCAGGCCACCTC 



TGCCATCGCGCTGGTCTGGGGGGTGCCTAACCT 



CCCGATTAGGCTTTTCCAGTAG 

>Rv2870c - TB.seq 3181770:3183077 MW:45324 >emb|AL123456|MTBH37RV:c3183077-3181767.

RV2870C ^^«!macgcGTCGTGATCCGGCGGCGCGGTGACAACGAGGTGGTGGCGCACAA 
GTGGOTACGGGTGGACGCGTCGTGATCC -TGACGGCCGGTTGCGGGTGGTGGTGCTGG 



rCGGGCTGGCCGCTGGCGGCGCCCATCTGGACACGTTGCTGCGACAACGTGCGCAG 



^==^^CA T CG^CAA T CGG 6 AC CaTTOG 
AGGTAGTC^^ , ^^"™^"" C ^ GCACGCGG CGCAGCGGGTCGGCGA(^TCCC 


(5CCCGGTCAGATCGTGCCGGTCGACTCCGAACACTCCGCGCTGGCCCAGTGCCTGCGCGGCG 

^ACTCCCGACGAGGTCGCCAAGCTGGTGCTGACGGCCTCGGGAGGGCCGTTTCGGGGCTGG 

T^GCG^^^^^CTCGAGCATGTCACCCCCGAGCAGGCTGGCGCGCATCCTACGTGGTCGATG 

G^^^^^GAACACGCTGAATTCGGCGTCGCTGGTCAACAAGGGACTTGAGGTCAT<^AAACC 

^^rT^TGTTCGGCATCCCCTACGACCGCATCGATGTCGTGGTGCACCCCCAGTCGATCATCC 

ATTra^^TCACCrrCATCGACGGTTCGACGATCGCCCAGGCCAGTCCCCCGGACATGAAGCT 

AC^A^TCG^AGCGCTGGGCTGGCCGCGTCGGGTCAGCGGCGCCGCTGCTGCCTGTGATTT 

C^A^V^AGCTGGGAGTTCGAGCCGTTGGACACCGACGTCTTCCCCGCGGTCGAGTT 

GGC^G^^^G^CGGCGTAGCCGGTGGCTGCATGACCGCGGTTTACAATGCGGCGAACGAAG 

!!r^G^C^GTTCCTTGCTGGCCGGATCGGCWCCCGGCCATCGTCGGCATCATCGCCG 

ACGTCTTOCAO^TCCCGACCAATGGGC<^TCGAACCCGCTACCGTGGATGACGTACTC<^^G 

^^T^GCCCGCGAGCGAGCGCAGCGCGCGGTATCTGGTATGGCTTCGGTGGCGATC 

GCAAGCA^GGCGAAGCCGGGCGCAGCGGGTCGACACGCATCGACGTTAGAAAGGTCCTGA 

>Rv2922c smc member of Smc1/Cut3/Cut14 family TB.seq 3234189:3238055 MW:139610
>emb|AL123456|MTBH37RV:c3238055-3234186. smc SEQ ID NO:113

r^nrGCTGCAG^ 
rrGG^ACGCGGC^ 

Sgt^ag^gctg^^ 

^^~qq«q^qYqqjCGCAGTCGACGGAGCAGATCACCGGGTTCAGCGAGCGGTAATCTGGCC 

cgtcgccgactggtcgcag (AAGGGCTTC aagtccttcgccgcgccgacgacttta 



cctcgtgtacctcaagagtctgacgttg; 
cgcttcgagccgggcattacggccgtcg 

GAT© 



TTGGGCCCAACGGCTCCGGCAAATCCAATGTGGTC 



CCCTGGCGTGGGTGATGGGGGAGCA* 



GGGGGCAAAGACGCTGCGCGGCGGCAAGATGG 



AAGACGTCATCTTCGCCGGCAO 



CTCGTCGCGTGCGCCGCTGGGCCGCGCCGAAGTCACCGTTA 



GCATCGACAACTCCGACAACGCACTGCC 



TATCGAATACACCGAGGTGTCGATCACCCGAAGAAT 



CGAGGAGATCTTGCAGTCGCGGCCTGAGGAT 
TGCTCAAGCATCGCAAGCGCAAGGAAAAAGCTCT 



^xTTrrCGACGGTGCCAGCGAATACGAAATCAACGGCAGCAGTTGCCGTTTGATGGATGTGCA 
^GCTGA^ 

GGAGTTGCTGAGCGACTCO \TCGGCGGGCGTTCATCGAGGAAGCCGCCGGTG 

GCGCAAACTCGACACGATGGCGGCGAACC 

CGCG^^GAGTC^GGGGTCGCCGAACTCTCGACGCGGGCCGAGTCGATCCAGCACACTTGGTT 

SS^tgcggtggccgaaggggtggacgctacggtgc^catcgcoagcgaacgcgcc^ 

Axr ATrTCGATATCGAGCCGGTAGCGGTCAGCGA 

^~cagcaggtggccgtcgccgagcmcaactgttagcggagctggacgcg<^ 




.tctcgatatcgagccggtagcggtcagcgacaccgaccccagaaagcccgaggagctag 
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CCGACCGGGCACACCTGGCGGCG^CC^A^AG^AC AGAGCG TGGCAC 
GGTTGTCCGAGCGGATCGAGGATGC GCGAGGTCGGCC TGGATGAGCACCACGA 
^^^^"^J^^Ce escTCGCATCGATGCGCTCGCAGTGGGGCTA 

^.y^^^^Q^^^^^^^^^^^ACGTTCCGGCTATGAAGCGGCACTGGCCGCGGCGCTCGGGC 
ie ^q^^^^^^^^GCACTTGCGGTGGACGGCCTGACTGCCGCGGGTAGTGCCGTCAGCGCACTC 

CCAATCC<*^.^— XTGCTTTCGGGTGTCGCGGTGGTCAACGACC 



CTCCACCGCAGTTGGTTGGCGCGATGATCGCCA- 



TGA< 



CTGAGGCAATGGGCCTGGTCGAGATTCGTCCGGA. 



GCTACGCGCGGTCACCGTTGACGGTG 






ATCTGGTGGGCGCCGGCTGGGTCAGCGGCGGA 



JCGGACCGCAAGCTGTCCACCTTGGAGGTC 



ACCTCCGAGATCGACAAGGCCAGGAGTGAGCT 



GGCCGCTGCCGAGGCGCTGGCGGCGCAATT 



GAATGCGGCCCTGGCCGGTGCGCTGACCGAGCA 



GTCCGCCCGCCAGGACGCGGCCGAGCAA 



GC' 



CTTGGCCGCGCTTAACGAATCCGACACGGCCATC 



TCGGCGATGTACGAGCAGCTGGGCCGC 



CTCGGGCAGGAGGCCCGCGCGGCGGAAGAAGA 



GTGGAACCGGTTGCTGCAGCAGCGTACGGA 



CGTCATACAACTTGAGACCCAGCTGCGTAA 



CGCGCCAACGCGGTTC 



TCGCGGCCGCGGTGTCGAAGTGG^GOCCGGCTGGCGGTGCGCACCGCCGAGGAA 
AGGCGCGGGTGCGGGCTCAGCAAG^CGCG^GCAAGACTGCATGCGGCCGCGGTGGCCGC 






AGCGGTCGCCGACTGCGGACGGCTGCTGGCCG 



GGCGGTTGCACCGGGCGGTGGACGGGGCG 



;gcaactgcgcgacgcgtcggccgcgcaacgtca< 



GCAGCGGTTAGCGGCGATGGCCGCGGT 



^^^^^^^^^^^QACGCTGAGCGCCCGAGTGGGGGAACTCACCGATTCGCTGCACCGCG 
AGGAGCTGGCTAACGCGCAGGCGGCGCTGCG^^CGAGG^^^^^^^^^^^^^^ 



GCTTGAGCAGATGGTGCTAGAG 



CAGTTCGGAATGGCGCCGGCCGACTTGATCA 






ACCGAGCTCGAGATGGCTGAGTTCGAG 



CAAGCCCGCGAACGCGGCGAGCAGGTGATTGCGCC 
CGCCCCCATGCCGTTCGACCGGGTTACCCAGGAGCGCCGGGCCAAACGCGCCGAGCGTGCGC 

cg^tg"Ja^ 
3s a^cgcggcg™ 

onArv^CATGCTCACCACCGGCATCGAGGTCGAAGCCCGCCCGCCGGGCAAGAAGATTACCC 
GACT^G™ 
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TTCGTGCCCGTCCATCGCCGTTCTACATCATGGACGAGGTGGAGGCCGCCCTCGACGACGTGA 
ACCTGCGCCGACTGCTCAGCCTGTTCGAACAGCTGCGAGAGCAGTCGCAGATCATCATCATCAC 
CCACCAGAAGCCGACGATGGAGGTCGCGGACGCACTGTACGGCGTAACCATGCAGAACGACG 
GCATCACCGCGGTCATCTCGCAGCGCATGCGCGGTCAGCAGGTGGATCAGCTGGTTACCAATT 

CCTCGTAG 

>Rv2925c rnc RNAse III TB.seq 3239829:3240548 MW:25400
>emb|AL123456|MTBH37RV:c3240548-3239826. rnc SEQ ID NO:114

ATGATCCGGTCACGACAACCCCTGCTCGACGCACTCGGTGTGGACCTCCCGGACGAGCTGCTC 

TCACTGGCGTTGACCCACCGCAGCTACGCCTACGAGAACGGCGGGCTGCCGACCAACGAGCGT 

TTGGAGTTTCTCGGCGATGCCGTGCTAGGGCTGACCATCACCGACGCGCTGTTCCATCGTCATC 

CTGATCGGTCGGAGGGGGATCTGGCCAAACTGCGGGCCAGCGTAGTCAACACCCAGGCCCTG 

GCCGACGTCGCACGCCGCCTCTGTGCGGAAGGCCTCGGTGTTCACGTGCTATTGGGTCGCGGC 

GAGGCGAACACCGGCGGGGCCGACAAGTCCAGCATTCTGGCCGACGGTATGGAATCGCTGCT 

GGGCGCGATCTACCTGCAACACGGTATGGAGAAGGCCCGTGAGGTGATCCTGCGGCTGTTTGG 

CCCGTTGCTGGACGCCGCGCCGACCCTGGGTGCGGGATTGGATTGGAAGACCAGCTTGCAGG 

AGCTGACTGCAGCGCGAGGGCTGGGTGCGCCGTCATACCTGGTCACCTCCACCGGCCCGGAC 

CACGATAAGGAATTCACCGCGGTGGTTGTCGTGATGGACAGCGAATACGGTTCAGGAGTGGGC 

CGGTCCAAAAAAGAAGCCGAGCAAAAAGCCGCGGCGGCCGCTTGGAAAGCCCTGGAAGTGCTC 

GACAACGCCATGCCGGGCAAAACCTCCGCCTAA 

>Rv2934 ppsD TB.seq 3262245:3267725 MW:193317 
>emb|AL123456|MTBH37RV:3262245-3267728. ppsD SEQ ID NO:115 

ATGACAAGTCTGGCGGAGCGCGCGGCGCAACTGTCGCCGAACGCGCGAGCGGCCCTGGCGCG 

CGAGCTCGTCCGTGCGGGTACGACCTTCCCGACCGACATCTGCGAGCCGGTGGCGGTGGTGG 

GCATCGGCTGTCGCTTTCCGGGGAATGTGACTGGGCCAGAGAGCTTTTGGCAGCTACTGGCCG 

ACGGTGTGGACACAATCGAGCAGGTGCCGCCTGATCGGTGGGATGCGGACGCGTTCTACGATC 

CCGATCCTTCGGCGTCGGGTCGGATGACGACGAAATGGGGTGGTTTCGTTTCCGATGTCGACG 

CGTTCGACGCCGACTTTTTCGGAATCACTCCTCGGGAAGCCGTGGCGATGGACCCGCAGCATC 

GGATGCTGCTCGAGGTTGCCTGGGAAGCGTTGGAGCACGCGGGTATTCCGCCGGATTCCTTGA 

GCGGCACTCGAACCGGCGTGATGATGGGTCTGTCGTCGTGGGACTACACGATCGTCAATATCG 

AGCGCAGAGCCGACATCGACGCGTACCTGAGCACCGGAACCCCGCACTGTGCCGCGGTGGGG 

CGGATCGCGTATCTGTTGGGATTGCGTGGTCCGGCCGTCGCCGTAGATACCGCTTGTTCGTCGT 

CGCTGGTGGCAATTCACTTGGCGTGTCAGAGCCTTCGCCTGCGTGAAACCGACGTGGCATTGG 

CGGGCGGGGTGCAGCTCACCTTGTCACCGTTCACCGCCATCGCGCTGTCCAAGTGGTCGGCGC 

TGTCACCGACCGGCCGATGCAACAGCTTCGACGCCAACGCGGATGGATTCGTGCGCGGCGAG 

GGCTGCGGCGTGGTGGTGCTCAAGCGGTTGGCCGACGCGGTGCGCGACCAGGACCGGGTGCT 

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TGCGGTGGTCCGCGGTTCGGCAACTAACTCCGATGGTCGC3TCCAACGGCATGACCGCACCGAA 

CGCGCTGGCGCAGCGTGACGTGATCACATCCGCCCTCAAGCTTGCGGATGTTACCCCTGACAG 

CGTGAACTATGTCGAAACACACGGCACCGGAACGGTGTTGGGGGACCCCATCGAGTTCGAGTC 

GCTGGCGGCCACTTATGGCCTGGGTAAAGGCCAGGGCGAGAGCCCX3TGCGCATTGGGGTCGG 

TCAAGACCAACATCGGCCACCTGGAGGCGGCCGCCGGTGTGGCTGGATTCATCAAGGCGGTGC 

TGGCGGTGCAACGTGGGCACATTCCCCGCAACTTGCACTTCACCCGGTGGAACCCGGCCATCG 

ACGCGTCGGCGACGCGGCTGTTCGTGCCGACCGAAAGCGCCCCGTGGCCGGCGGCTGCCGGT 

CCACGCAGGGCTGCGGTGTCATCGTTCGGCCTCAGCGGGACCAACGCGCACGTGGTGGTCGA 

GCAGGCACCCGACACCGCAGTAGCCGCAG<XGGCGGCATGCCGTATGTTTCGGC(K:TGAACG 

TCTCCGGCAAGACGGCCGCGCGGGTGGCGTCGGCGGCGGCGGTGCTGGCCGACTGGATGTC 

GGGGCCGGGCGCGGCGGCACCACTGGCCGACGTGGCACACACGTTGAACCGGCACCGGGCC 

CGGCACGCCAAGTTCGCCACCGTCATCGCGCGTGACCGCGCCGAGGCGATCGCGGGGTTGCG 

AGCGCTGGCGGCCGGACAACCACGCGTTGGGGTGGTGGATTGCGACCAGCATGCCGGTGGGC 

CTGGCCGGGTTTTTGTGTATTCGGGTCAGGGCTCGCAGTGGGOSTCGATGGGCCAGCAG-TTGC 

TGGCCAACGAACCGGCGTTCGCCAAGGCGGTAGCCGAGCTGGATCCGATATTCGTTGACCAGG 

TTGGCTTTTCGCTGCAGCAAACGCTTATCGACGGCGACGAGGTGGTGGGCATCGACCGCATCC 

AGCCGGTGCTGGTCGGGATGCAGTTGGCGCTGACCGAGTTATGGCGGTCCTATGGGGTGATTC 

CAGA^3^^GTGATCGGGCACTCGATGGGTGAGGTGTCGGCGGCAGTGGTGGCCGGCGCGTTG 

ACGCCCGAGCAGGGCTTGCGGGTCATCACCACCCGGTCGCGGTTGATGGCGCGGCTGT(^^^3 

GCAGGGAGCGATGGCGCTGCTCGAGCTGGATGCCGACGCCGCCGAGGCGCTGATTGCCGGCT 

A^^<^£A&^TC3ACGCTGGCGGTGCATGCGTCACCGCGCCAGACGGTGATCGCCGGGCCGCCC 

g^tLacacggtgatcgcggcggtagcgacgoaaaaccggttggcgc^ 

AtTCGA^TGGCCTOCCATCACCCGATCATCGATCCCATACTGCCCGAGTTGCGAAGCGCGTTA 

gcggmttgactccgcagccgccgagcatcccgatcatttccactacgtacgaaagcgcgcag 
SggCgTcggatgccgactattggtcggccaacc^^ 

gtcaccgccgccggtgtcgaccacaacaccttcatcgaaatcagccctcaccccgtgctcacg 

cacgcactcaccgacaccctggatccggacggcagccatacagtcatgtcgacgatgaaccgc 

gaaLgaccagacgotgtatttccacgcccaactogccgcggtcggtgtggctgcg^ 

cacaccaccggtcgccttgtcgacctgccccccacaccgtggcaccatcagcgattctgggtc 

acggItcg^cggcgatgtccgagctggccgcgacccacccgctcctgggcgcgca^ 

gIt^ggcgcaacggagaccatgtctggcagaccgatgtcggcaccgaggtctgtcc^ 

ggcagaccacaaggtgttcggtcaacccatcatgccggccgcggggttcgccgagatcgcctt 

ggcggcggccagcgaagccctcggcacagccgccgacgccgtcgcacccaacatcgtgatca 

acc^gttcgaggtggagcagatgctgcccctcgacggccacaogccgctaacgac^ag^aa 

ttcgcggcggggacagccagattcgggtcgagatctattcccgcacgcgtggcggagagttct 

gccg^acgccacggccaaggtt<^caat^^ 

gcccaagg1 
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3TCCCGCCACCGGGACAACAGTGTCGCCGGCCGATTTTTATGCCCTGCTCCGCCAA 
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ACCGGCCAACACCATGGTCCGGCGTTCGCGGCCTTAAGCCGGATCGTGCGCCTGGCCGATGGT 
TCCG^GGAWCCGAGATCAGCATTCCCGACGAGGCGCCGCGCCATCCCGGGTATCGGCTGCA 
CCCCGTGGTATTGGATGCGGCATTGCAAAGCGTGGGTGCCGCGATACCCGACGGCGAGATCGC 
GGG^C^^GCCAGCTATCTGCCAGTGTCGTrGGAGACCATCCGGGTGTACCGCGACAT 
s CG^TCGGCA^STCAGGTGTCGTGCCCACCTGACAAACCTCGACGGCGGCACCGGAAAGATGG 

gc!gg!^cctaatcaacgacgccg^^ 

GTCG^GTCGAACGCCGTGCGGTACCCCTGCCACTAGAGCAGAAGATCTTCGATGCCGAATGGA 

ccgamgcccgatcgcagccgtgccggctccggagccagctgccgagacgacgcggggaagt 

TOGCTG^GTACTCGCCGATGCAACGGTGGATGCGCCAGGCAAGGCCCAGGCCAAGTCGATGGC 

,o ^gIcttcgtgcagcagtggcgctcaccgatgcggogggtgcacaccgcogatatccacga 
cg^tcggcggtgctggccgcatttgcagaaacggcaggcgatcccgagcacccgccggttg 
^^gtgttcgtcggcggtggctcgagtcgactggaogacgaggtggcggcggcgcgc 
^^^cgqjgtggtcgatcaccacggtggttcgtgcggtcgtcggcacgtggcacggccgatca 

ccgcggct!tggctggtcagcgggggcggactttccgttggcgacgacga^ 

rorGGCGGCn'CCTTGAAAGGGCTGGTGCGGGTGCTCGCCTTCGAGCACCCGGACATGCGCA 
CGCGGCGGCTTCCTTGAAAGG qaAGACCCGCTGACCGCGCTGAGCGCGGAACTGCGGA 

MCGCGTGGCGCGGCGAGOGCAGGTTCGTC 






CCACCCTGGTCGATCTGGACATCACAI 
ATra-r.GGGAGTGGGTCGCGCCATGATGACGTGAl 

G/^CGGCTGTCGCGCGCCACGATCGATGTATCCAAAGGGCATCCGGTGGTGCGCCAGGGAGC 



GTCGTACGTCGTCA^CGGCGGCCTCGGCGGTCTCGGCCTGGTCGTCGCTCGTTGGCTGGTGG 
„ ACCGC^GCG^^GGCCGGGTGGTGCTGGGTGGCCGCAGCGATCCCACTGACGAGCAGTGCAAC 
G^C^GtcTGCAGACCCGCGCCGAGATCGTGGTTGTCCGTGGCGACGTGGCATCGCO 
^^QQ^Q^^^AAAAGCTGATTGAGACGGCCCGACAGTCTGGGGGCCAATTGCGCGGCGTCG 
jqq^^^^3C^GCGGTCATCGAAGACAGCCTGGTGTTCTCTATGAGCAGGGACAACCTAGAAC 




CTCGACTGGTGGGTCGGATrCTCTTC^CXGCTTCGCTATTGGGrrCTCCCGGGOAAGC^CCT 

acgCgtgcgccagcgcgtggctggacgcgctggtcggatggcgcagggcatccggcctgcc 



GGCCGCGGTGATCAACTGGGGTCCGTGGTCGGAGGTAGGCGTCGCCCAGGCCTTGGTGGGCA 
^G^^ATCAGTGTCGCAGAAGGCATCGAGGCTCTCGACTCATTGCTTGCCGCCGA 

Sgg^o^ggagtggctcggctgcgtgccgatcgggccctggtcgcattcccggaga 
M tccg^^gcatcagctacttcacccaggtggtcgaggagctggactcggcgggtgacctcggcg 

IcSc^OCCGACGCGC^GCCGACCTCGACCCGGGCGAGGCGCGGCGCGCGGTGAC 
CG^^MGTGTG^GCATCGCTGCGGTGATGGGCTACACTGACCAGTCGAOTGTCGAACC 
CGCCGTGCCCTTGGACAAGCCC^TGACCGAGCTGGGGCTGGATTCTCTGATGGCGGTACGAAT 

acg^acggcgcgcgggcggatttcggcgtggaaccgccggtagcgctgatactgcaaggcg 
35 cgtccttgcatgacctgacggcggacttaatgcgccaactcgggctcaatgatcc^^at^^gg 
^caacaacgctgacactattcgo 3 accg<^cgog^agcgcgcggca< 3 cgcgacacgga 



^gcggcgccgacctaaacctgaagtacagggaggataa 



GCCGCGA1 




>Rv2946c pks1 TB.seq 3291503:3296350 MW:166642
>emb|AL123456|MTBH37RV:c3296350-3291500. pks1 SEQ ID NO:116



^GGCCAACCCAGGGCTGGATCCGATCGATGTGGGGTGCTCGTTGGCCAGTCGCTCGGTGTTT 

5CACCGAGCGGTGGTGGTCGGCGCAAC 
CGCGGCGGGCGAGCCGGGTGCCGGCGTGGCGG 



GAGCACCGAGCG^TGGTGGTCGGCGCAAGCCGTGAGCAACTGATTGCCGGGCTGGCTGGGCT 
° AGCACCG A 3CS ' ' 5TCGGTCAGCCAGGGTCGGTGGGCAAGACG 
GTGGTCGTGTTTCCTGGGCAGGGCGCGCAGCGCATCGGGATGGGCCGCGAGTTGTACGGCGA 
GTTGCCCGTGTTTGCGCAGGCATTCGATGCGGTGGCCGACGAGTTGGACCGGCATCTGCGGTT 
GCCG^"G^^GGACGTTATTTGGGGTGCCGATGCGGATTTGCTTGACAGCACCGAATTTGCTCAG 
CC^GCGTTGTrcGCGGTGGAGGTGGCATCGTTCGCGGTGTTGCGGGATTGGGGTGTGCTTCCG 

^^caTgggtcactccgttggagagctggcggcggcgcacgoggccggtgtgttgac 

GTTCG^GGACGCGGCGATGCTGGTGGTGGCGCGGGGCCGGTTGATGCAGGCGCTGGCGGCA 

ggcggtgcgatggtggcggtggctgccagtgaggacgaggtggagccgctgctgggtgagg 
gtgtg^^cgctgcgatcaacgcgcccgaatcggtggtgatctc^ 

GGAAA^G^^ATreCGGATCGGTTCGGCGCGCAGGGTCGGCGGGTGCACCAGTTGGCGGTCTC 

gca^g^cattcgccgttgatggagccgatgctcgaggagttcgcgcgtgtcgcggcccg 

GCTG^^G^ACGCGAGCCCCAGCTTGGGCTGGTGTGGAACGTGACGGGGG^GTTG^^^GGCC^ 

cLtttcgggt^gcgcagtactgggtggaccacgttcgtoggccggtgcgcttcgcggaga 

GT^GTCATTTGCAGACCCTTGGGGCGACCCACTTCATCGAGGCCGGCCCGGGAAGTGGTT 

TGA^^GCTCGATC^AGCAGTCCnTGGCCCCGGCTGAGGCGATGGTGGTGTCGATGCTGGGCA 

M^CCGG^^^^GCTGGCCTCGGCGCTCGGTGCTGCCGGTCAGGTGTTCACCACCGGTGTG 

^GCAGTGGTCGGCGGTGTTCGCCGGCTCGGGTGGACGGCGGGTGCAGCTGCCCACGTA 

TG^^G^ACGGCGGTT^GGGAGACGCCGGGGGCGGATGGGCCCGCCGATGCGGCGG 

GGTTGG^^n^G^CGCGACCGAGCATGCCTTGTTGGGTGCGGTGGTCGAGCGGCCCGATrCT 

QACGAGGTGGTGCTGACCGGCCGGTTGTCGCTTGCGGATCAGCCGTGGCTGGCCGACCACGT 

G^TGAACGGGGTGGTGCTGTTCCCCGGGGCGGGTTTTGTGGAGTTGGTGATCCGCGCCGGTG 

^XQAGGTCGGGTGCGCGCTCATCGAAGAGTTGGTGCTGGCCGCACCGTTGGTGATGCACCCGQ 

g^ggttcaggtgcaggtggtcgtcggggctgccgatgaatocgggoaccgtgcgg™ 

TCGGTGTATTCCCGCGGTGATCAATCCCAGGGTTGGTTGCTGAACGCCGAAGGCATGCTGGGG 

gtgg£g£gctgagacg^^ 

G^TATCTCGGACGGCTATGCGCAGTTGGCCGAGCGCGGTTATGCCTACGGCCCCGCGTTTCA 

ggXgg^c?atgtggcggcgggggtcggagctgttcgocgaagttgtaggocc^ 

^^GTGGCOGTCGACCGAATGGGGATGCATCCGGCGGTGTTGGACGCGG^T^T 

gccctcgggctggccgtcgagaagacccaggcgagcaccgagacgagactgccgttttgctg 

q^qtGG^OTGTCGCTGCATGCCGGCGGCGCTGGACGGGTGCGGGCCCGCTTCGCGTCCGCG 
GGCGCGGATGCGATTTCCGTGGACGTCTGCGACGCCACTGGGCTGCCGGTGTTGACGGTGCG 
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G^^^^AOSTTGGTGGTGTTGACCCATGGTGGCGTGGGGCTGGCTGGCGAGGACATCAGCG 
t^^CGCCGCGGTGTGGGGCATGGCGCGTTCCGCGCAGGCCGAAAATCCCGGCCG 

^^^qjq^^^G^3^^^^GACTGTGCACGCCCCCCGGCTGTCCCCGGCCCCGGCGTTG 
^q^^^^^CGGCC6TCG^^TCAACTTCC6CGATGTGGTGGCGGCCCTAGGGATGTATCCC^ 



GGCCAGGCCCCACCGCTGGGTi 



GCCGAAGGCGCCGGGGTGGTGCTTGAGACCGGTCCCGAAGT 



.CCGATCTTGCCGTCGGTGACGCCGTGATGGGATTCCTGGGCGGGGCCGGTCCGCTGGCGG 

&CCCGGGTGCCC 
GCCTGGTACGG< 
GGTACCGGCGG" 

CAGTGGGGCGTGGAGGTTrrCGTOACCGCCAGCCGTGGCAAGTGGGACACGCTGCGCGCCAT 



«*"*"~ ^CCCGGGTGCCGCAAGGCTGGTCGTTTGCTCAGGCAGCCGCTG 

JCGATTTAGCCGAGATCAAGGCGGGCG 
^TGCTGATCCATGCCGGTACCGGCGGTGTGGGCATGGCGGCTGTGCAGCTGGCTCGC 



15 T^CGCTGGT^TC^^^GTOTGGTACGGGTTGGCCGATTTAGCCGAGATCAAGGCGGGCG 
AATO 

ggg<^^ga^g^3Ga^c^at^g^^t^xo^Satgcgagttcgaggagaagttcctggc 

° G ^^^^GGGTOATGTGGTGCTCGACTCGCTGGCGGGTGAGTTCGTGGATG 

^Igg^^ctaattatccc<bgcgtgcagtatcgggcg^cgacctgtcggagg 

agccagg^ccgwiatatcggcaaggttgtcttaaccatgccctcggcgttggccgaccggctt 

^aS^^tgatcacgggtgccaccggggcggttggtgg^tgttggcc^ 
^^S^Statggggtgcgtcatctggtgttggccagtcggcggggcgatcgcgcgg 
^^^^g^cgccgacttgacggaggccggcgccaaggtgcaggtggtggc 

CTCTI^^Gre^^^ATCGCGCTGCGGTAGCGGGGTTGTTTGCCCAGCTGTCGCGGGAGTACCC 
30 ^^CGGGGTGATTCAT G CCGCCGGCGTGCTCGATGACGCAGTGATCACCTCG TO AC 

ACCGGACCGCATCGATACGG 

GCCGGGGCAGGGCAACTACTCGGCGGCAAACGCGTTTCTGGACGGGTTGGCCGCTCACCGGC 



ACCGGACCGCATCGATACGGTGTTGCGGGCCAAGGTGGACGCGGCGTGGAACCTGCACCAGG 
CCACCAGTGACCTGGATTTGTCGATGTTTGCGCTGTC 



rGCTCATCGATCGCGGCCACGGTCGGCTC 



\CTGGCGTGGGGTTTGTGGGAACAGCCTGGCGGCATG 
.rCGCGCATTTGAGCAGCCGAGATCTGGCCCGCATGAGCCGCAGCGGGCTGGCTCCGATGAG 

CrCTGCCGA^ 



AGGCCGCAGGGTTGGCCGGGATATCAC 


\GACGCCCGGGCCCAGGCCGGTGCGTTGCCGGCGCTGT 



CACGCTCTTGGACCGGGCTGCACTAC 

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TCAGCGGGCTCGCGCGCCGCCCACGCCGACGCCAAATCGACGACACCGGTGACGCCACCTCG 

TCGAAGTCGGCGCTGGCTCAACGCCTACACGGGCTGGCCGCGGACGAACAACTCGAGCTGCTA 

GTGGGGCTGGTGTGTCTGCAGGCAGCGGCAGTGCTGGGTAGGCCCTCCGCCGAGGACGTCGA 

CCCCGACACCGAATTCGGCGACCTCGGTTTCGACTCATTAACGGCTGTGGAGTTACGCAACCGC 

CTCAAAACCGCCACCGGACTGACGCTGCCACCTACCGTGATTTTCGATCATCCCACTCCCACTG 

CGGTCGCCGAGTATGTCGCCCAGCAAATGTCTGGCAGCCGCCCAACGGAATCCGGTGATCCGA 

CGTCGCAGGTTGTCGAACCCGCCGCCGCGGAAGTATCGGTCCATGCCTAG 

>Rv3014c ligA DNA ligase TB.seq 3372545:3374617 MW:75258
>emb|AL123456|MTBH37RV:c3374617-3372542. ligA SEQ ID NO:117 

GTGAGCTCCCCAGACGCCGATCAGACCGCTCCCGAGGTGTTGCGGCAGTGGCAGGCACTGGC 

CGAGGAGGTGCGTGAGCACCAGTTCCGTTATTACGTGCGGGACGCGCCGATCATCAGCGACGC 

GGAATTCGACGAGCTGCTGCGCCGTCTGGAAGCCCTCGAGGAGCAGCATCCCGAGCTGCGCA 

CGCCCGATTCGCCGACCCAGCTGGTCGGCGGTGCCGGCTTCGCCACGGATTTCGAGCCCGTC 

GACCATCTCGAACGAATGCTCAGCCTCGACAACGCGTTCACCGCCGACGAACTCGCCGCCTGG 

GCCGGCCGCATCCATGCCGAGGTCGGAGACGCCGCACATTACCTGTGTGAGCTCAAGATCGAC 

GGCGTCGCGCTGTCTTTGGTCTACCGCGAGGGACGGCTGACCCGGGCCTCCACCCGCGGCGA 

CGGGCGCACCGGCGAGGACGTCACCCTGAACGCCCGGACCATCGCCGACGTTCCCGAACGGC 

TCACCCCCGGCGACGACTACCCGGTGCCCGAGGTCCTCGAGGTCCGCGGCGAGGTCTTCTTCC 

GGCTGGACGACTTCCAGGCGCTCAACGCCAGCCTCGTCGAGGAGGGCAAGGCGCCGTTCGCC 

AACCCCCGCAACAGCGCGGCGGGATCGCTGCGCCAGAAAGACCCGGCGGTCACCGCGCGCCG 

CCGGCTGCGGATGATCTGCCACGGGCTGGGCCACGTGGAGGGCTTTCGCCCGGCCACCCTGC 

ATCAGGCATACCTGGCGTTGCGGGCATGGGGACTGCCGGTTTCCGAACACACCACCCTGGCAA 

CCGACCTGGCCGGTGTGCGCGAGCGCATCGACTACTGGGGCGAGCACCGCCACGAGGTGGAC 

CACGAAATCGACGGCGTGGTGGTCAAAGTCGACGAGGTGGCGTTGCAGCGCAGGCTGGGTTC 

CACGTCGCGGGCGCCGCGCTGGGCCATCGCCTACAAGTACCCGCCCGAGGAAGCGCAGACCA 

AGCTGCTCGACATCCGGGTGAACGTCGGCCGCACCGGGCGGATCACGCCGTTTGCGTTCATGA 

CGCCGGTGAAGGTGGCCGGGTCGACGGTGGGACAGGCCACCCTGCACAACGCCTCGGAGATC 

AAGCGCAAGGGCGTGCTGATCGGCGACACCGTGGTGATCCGCAAGGCCGGCGACGTGATCCC 

CGAGGTGCTGGGACCCGTCGTCGAACTGCGCGATGGCTCCGAACGCGAATTCATCATGCCCAC 

CACCTGCCCGGAGTGCGGTrCGCCGTTGGCGCCGGAGAAGGAAGGCGACGCCGACATCCGTT 

GCCCCAACGCCCGCGGCTGCCCGGGGCAACTGCGGGAGCGGGTTTTCCACGTCGCCAGCCGC 

AACGGCCTAGACATCGAGGTGCTCGGTTACGAGGCGGGTGTGGCGCTCTTGCAGGCGAAGGT 

GATCGCCGACGAGGGCGAGCTGTTCGCGCTGACCGAGCGGGACTTGCTGCGCACCGACCTGT 

TCCGAACCAAGGCAGGCGAACTGTCGGCCAACGGCAAACGGCTGCTGGTCAACCTCGACAAGG 

CCAAGGCGGCACCGCTGTGGCGGGTGCTGGTGGCGCTGTCCATCCGCCATGTCGGGCCGACG 

GCGGCCCGCGCCCTGGCCACCGAGTTCGGCAGCCTTGACGCCATCGCCGCGGCGTCCACCGA 
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CCGTCGAU,^^™ T _ TGGCCGGGCTGACCATCGT GGTCACCGGCTCGCT 



CCAGCTGGC^uuo , - \cAAGTGGCGGGCCGCCGGGGTGCGAATGGTCGAC 



rCGACTGGCACCGCGAGATCGTCGAC 
GGACCCGCGTCACGAACGTAA 



>Rv3025c - NifS-like protein TB.seq 3383885:3385063 MW:40948
>emb|AL123456|MTBH37RV:c3385063-3383882. Rv3025c SEQ ID NO:118



G 



CGGATCGAGGAGGCCCGTGAGCTGA" 



TCGCGGACAAGCTAGGCGCTCGTCCGTCCGAGGTGA 



TCTTCACCGCGGGCGGCACCGAAAGCGA 



.CAACCTGGCTGTCAAAGGTATCTATTGGGCACGCC 



^CCACCGAGGTGGAACACCACGCCGTACTG 
GCCCATGTGACCTGGCTGCCGACCGCCGC 



15 GCGATGCGGAGCCGCACCGCCGTCGCATCGTCA 

AC GGCCGCGCAGATCGCGGTGGACGGACTC ^ A ^ T ^ G ^ AACGGCGCCGATGA 



GGATCG 



TCTGGTCGAGGGTGTGCTGGCTGAGA 






CCCGATGCGGCTAGCGGGTAACGCGC, 



ACTTCACTTTCCGTGGCTGCGAAGGCGATGCGCTGTT 



TGTTGTTGGACGCTAACGGAATCGAGTGCTi 



GA 



GCAGCCi 



CAACCGGATCGGCCTGCACGGCAGGTGTAGC 



CTCGCATGTGTTGATTGCAATGGGCGTCG 



A< 



CGCGGCCAGCGCCCGCGGATCATTGCG 






TCTCTCGCTGGGGCACACCAGTGTTGAGGCTGATGTCGATGCCGCGTTGGAGGTGCTTCCCGG 
GGCGGTOGCACGTGCACGGCGGGCCGCCCTAGCCGCCGCGGGAGCATCCCGATGA 

, Rv 3080c PKOK p^n Hnas. TB.se, »«»•*«•• ^,,9420 

>emb|AL123456|MTBH37RV:c3445985-3442653. pknK SEQ ID NO:119

3GGCGCGGCGGATTCGGCGTCGTCTACCGC 
.GCCCTCGCTGGACCGCGCCGTCGCCGTCAAGGTATTGAGCACCGACCTGGATCGG 



35 g^AATCTCGAGCGCTTCCT 

c'ggtgacggCgcaggtgggggtgttggcgggtgggcggcg^^^ 




iCCGTGACGTGAAGCCGGGGAATATCCTGCTGACCGACTACGGGGAACCGCAGCTGACCGAT 
T-rCGGAATCGCCAGAATCGCCGGGGGTTTCGAGACGGCGACCGGGGTGATTGCCGGTTCCCCG 



CGTCGACGAGATGCCCCTCCCCGTCGAGCTGGGCGTGGAACGCCGAGGCTCGCCCGAGGCGC 
ACG^CATCGGCATAGCGGCGGOGGCACCCCGACGGTCCCGACGCCTCCGACACCC<^ 



CCACGCCAAGAATTCGTTGGAGACGCTGATTCGCCGGCACGGGCCGCTGGACTGGCGCGAGA 
CA 

. q^^q^ccGCG^^^G^A^TCTCGAAGGAGCATCGCCGACGCCCGCCTCTGACGTGTACTCC 
CTGGGCGC^^3TTGTTCTGTGCGCTGACCGGCCATGCCGCCTACGAGCGCCGCAGCGGTGA 

^g^ccag^cctgcggatcac^^^ 

Sctg^gcggacgtggccgccgccatcgaacgggcgatggcccgccatccg^tcg^ 
^^^cggcagaggttggcgaggagcttggcgacgttcagcgccgcaacggcgtcag 


^^^^^^^^^^^^^^oxgoowccggctcgctggtcacccgcagccggctcaccg^at 

^^CGGCGGACGGCGCCGGCTGATCCTCATCCACGCGCCCTCGGGATTCGGCA^ 

SI^g^gcgcaatggcgggaagagctctcgcgcgacggcgcogcggtggcctggct 
« ^caatcgacaacgacgacaacaacgaggtgtggttcttgtcgcacctgctcgagtcgatccg 

" gSSgcccacgctggccgag^ 

ccggccgctacgtgttgacttcgctgatcgacgaaatccacgaaaacgacgaccggatcgcgg 

tgg^atcc^^^ctggcatcgggtgtccgacagccgcacccaagctgccctgggtttcctgc 
tggagaa^gtcaccacctg^^ 

tgg^gttgoggatcggcgacgaactagcogagatcgattcggctgctttgcgcttcgata 
Sacgagg^^ 

^gcg^aotacctctaccgacgggtgggccgcggcgctgcggotggccgcgctgtcgct 

™^G^CGACGCGACCCAACTCCTGCGCGGACTTTCCGGCGCCAGTGACGTGATCC 

aSaa^^gcgaaaacgtgctggacaocctggaacccgaactgcgcgaattcctactggt 

25 GGC^C^GT^ACCGAACGCACGTGCGGCGGGCTGGCCTCGGCGCTGGCC^GGATCACCMTG 

ggcgggcgatgctggaagaggccgagcaccgcggcttgttcctgcaacggaccgaagacgac 

CCG 






CTGCACGAAGCCGTCGACCATGCACTGG 




aattggtttcgcttccaccaaatgttcgccgactttctccaccgtcgcctcgaacgtggcg 
ggtcgcaccgggtggcggaactgcaccgcagggcatcggcctggttcgccgagaacggctac 

CCGCGGGCGATCCCGCGCGCGCCGTCGATCTTGT 
CGAGCAGGATGAAACGAACCTGCCGGAGCAGTCAAAGATGACCACACTTCTGGCAATCGTGCA 
GAAACTG^CGACGTCGATGGTGGTTrCACGGGCCCGGCTCCAACTCGCCATCGCGTGGGCGAA 

Lgcaacgoccggcgccggccaccggtgccctgaatcgtttcgaaacggcccttgg 

CCGGGCCGAGCTTCCCGAGGCGACGCAGGCGGATCTGCGGGCCGAGGCAGACGTGJ^TCG© 

kGACCGGGTCGAGCGCGTGGATGACCTTCTCGCCGAGGCAAT 
GTCGAGACCGGACACCCTGCCCCCGGGAGTCCCCGGGACCGCCGGCAACACCGCGGCGrrGG 

SLgatotgccgcttcgagttcg<xgaggtatatccactgctggactgggccgcgccctacc 



CATTCT 



GCGGTCGCCGAGGTGTTCGCAC 


^^r-r- AT/-T«mnr.TTCGAGTTCGCCGAGGT 

rGCGCAGTGCTTGCGCGGCATGGCGGCCAGGA 



AGGAAATGATGGGACCGTTCGGCACCGTTTATC 


ATCGGCTCGACATTGTCGCTGCGCTACAGAACTTCCGAACGGCOTTCGAGGTCSGCACGGCAG 

^ggStcgcacgcggcgcggcttgcggg^cgctgctcgccgaattgctctacgag 

ACC^GC^ATCTGGCCGGGGCTGGTCGTCTCATGGACGAGAGCTATCTGCTGGGTTCCGAGGG 

GG^^GCAGTGGACTACCTGGCCGCCAGGTACGTGATCGGCGCGCGGGTCAAGGCGGCCCAGG 

GGGATCATCAGGGT6CGGCTGATCGCCTGTCCACCGGAGGCGATACTGCCGTCCAGCTGGGG 

CTGCCGCG^CTGGCTGCCCGAATCAACAACGAGCGGATCCGGCTGGGCATCGCGCTACCTGC 

GG^^CTGG^CGCCGATTTGCTGGCAGCCCGCACCATCCCCCGCGACAATGGAATCGCCAG^T 

GACA^SCCGAACTCGACGAGGACTCCGCGGTGCGCCTGTTGTCCGCCGGCGACTCGGCCGATC 

QTQACCAAGCCTGCCAACGGGCCGGTGCTCTCGCCGCCGCCATCGACGGTACGCGCAGACCG 

CTG^CG^CGCTGCAGGCGCAAATACTrCATATCGAAACGCTTGCCGCCACCGG^GGGAATrcC 

Stgo^gaaacgaactggcgccggtagccacgaagtgcgccgaactcgggctgtcacgtct 
gctggtcgatgcgggactggcctaa 

>Rv3106 fprA adrenodoxin and NADPH ferredoxin reductase TB.seq 3474004:3475371
MW:49342 >emb|AL123456|MTBH37RV:3474004-3475374. fprA SEQ ID NO:120
AT^W^CCCTATTACATCGCCATCGTGGGCTr^GGGCCGTCGGr^TTCTTr^CCGC^Mr^VrCC 

^G^TGAAGrSCCGr^GACACGACCGAGGACCTCGACATr^CCGTCGAr^TGCTGGAGATTOTTG 

Cr^3^CTCCCTGGGGGCTGGTGCGCTCCGGGGTCGCGCCGGATCACCCCAAGATCAAGTCGAT 

CAG^AAGCAATTCGAAAAGACGGCCGAGGACCCCCGCTTCCGCTTCTTCGGCAATGTGr^r^T 

CGGC^AACACGTCCAGCCCGGCGAGCTCTCCGAGCGCTACGACGCCGTC^JCT^r^^C^^CG 

GCGCGCAGTCr^ATCGCATGTTGAACATCCCCGGTGAGGACCTGCCGGGCAGTATCGCCGCC 

GTO^Tn^^^^GGTGGTACAACGCACATCCACACTTCGAGCAGGTATCACXlCGATCTGTCGG 

GC^^Cr^GGCCGTAGTTATCGGCAATGGAAACGTCGCGCTAGACGTGGCACGGATTCTGCTCA 

CCG^TC^O^W^TGTTGGCACGCACCGATATCGCCGATCACGCTTTGGAATCGCTACGCCC^C 

™^OAGGAGGTGGTGATCGT^GGCGCCGAGGT<XGCTGCAGGCCGCGTTCACCACG 

y^^^^Q^GCG^GAGCTGGCCGACCTCGACGGGGTTGACGTGGTGATCGATCCGGCGGAGCT 

GGAC^^^A^ACCGACGAGGACGCGGCCGCGGTGGGCAAGGTCTGCAAr^^GAACATCAAG^ 

TGCTGCGTGGCTATrKGGACCGCGAACCCCGCCCGGGACACCGCCGCATGGTGTTCCGGTTCT 

Xr^CCTCT^O^TCGAGATCAAGGGCAAGCGCAAAGTGGAGCGGATCGTGCTGGGC^^^AACG 

AGCTGGTCTCCGArjJGGCAGCGGGCGAGTGGCGGCCAAGGACACCGGCGAGCGC^AG^^^^T 

GCCAGCTCAGCTGGTCGTGCGGTCGGTCGGCTACCGCGGGGTGCCCACGCCCGGGCTGCCGT 

tSao^agagcgggaccatcc^^^^ 

Ic^Sg^g^tcaagcgcgggccgaccggggtgatcgggaccaagaagaaggacgcc 
a^ccgLgatcatgccgaccaggtggccgactggctagcagcacgccagccgaagctg 

GTCACG^^G^^^ACTGGCAGGTGATr^^CGCTTTCr3AG(^GGCCGCCGGr^AGCCGCACGG 
GCGTCCCCGGGTCAAGTTGGCCAGCCTGGCCGAGCTGTTGCGGATTGGGCTCGGCTGA 
>Rv3235 - TB.seq 3611296:3611934 MW:22659 >emb|AL123456|MTBH37RV:3611296-3611937.

^^T^S^GTGCCCCTTGAOCATCAGTCCTATCGCGAACTCACCGGGCGACA 
CC^CG^CGTCAC^C^C^TCGTCGAGTACGAGCCGCCGCCGCGAAACATCCCGCCGTGCGGG 

caatcatSc^Sgcccggcggccgcacaccccgcagctaggtcgccgacaaccaatcagg 

C^G^^AOOGGCAGCGGTCACCTCCACGGCCAAGTCACOGCGGCTGCGTCAAGC 



GGGGACCTTCGCCGATGCCGCGCTAOGGCGAGTGCTGGAGGTCATC.3ACCGCCGCCGCCCGG 
TGGGCCAGCTGCGCC 

ACGGCGGCCGGACAO , GGTCT TCGGCACCTACAGTCGCGGGGACCGGATCC 



GCCCTGCACATCGGGTGA 

>Rv3255c manA mannose-6-phosphate isomerase TB.seq 3635040:3636263 MW:43340

>emb|AL123456|MTBH37RV:c3636263-3635037. manA SEQ ID NO:122
^^^^T^GTTACGCACGTACGCTTGGGGATCGCGOACCGCTATCGCCGAA 

^GG^C^TGCCGGOCGCTOACCCOGAGGCCGAACTATGGTTCGGTGGACACGC 



GGATCCGGA 



GGGGCAGCTCGGCTCCGCGTCGCGC 



GGGTGATCCGGCTTGGCTGCAGACGCCGCATGGCCAAACCTCGTTGCTCGAAGCGTTGGTCGC 

GTCG 

:act> 

^.nn-rrrAGGGCTACCTGCGGGAAGAGCGAATGGGCATTCCGGTGTCCTCACCCGTCCGCAAC 
^G^C^ 



20 C3U^i^a.^o^. • "5CGCGCGATTCGGCGATGTGTTGCCGTTCT 

TGGTCAAGGTGTTGGCGGCCGACGAGCCACTATCGTTGCAGGCCOATCCGAGCGCCGAGCAG 






OGATTCCGGGAGGCGGCTCGCACCAGCGAGCTGCTGCGGGCGCTGGCCGTATCCGACCTCGA 

CCCGTTCATC 
CTGGATTACC 

CCAGTACGTG™, CGGC GTTGTTGCTCAACCGCATCAGCTTGGC 



CCCGTTCATCGACTTGGTGAGCGAGGGGTCCGATGCCGATGGTTTGCGTGCGCTGTTCACCAC 

ctSat^accg^accccagcccgacatcgacgtgctggtgcctgccgtgctggac 
OTTCA " A TcagctSggcgoaacggaatttggcgccgaagccaagacaG7Gctggaactcgg 



30 gg^/^^g^togcoaactccgacaacgtgttaggcggtggactta^^c^ta^g^ac^^^v^^ 

gcccgagttgttgcgggtgctggacttcgcccccacgccgaaggc^^ 
tccggcgcgaggggctggggctggtctttgagacgcccaccgatgagttgggg^g^gacgcta 

rTGGTGCTCGACGGCGATCACCTCGGCCACGAGGTCGACGCGTCGTCCGGCCATGACGGTCC 
« ACAGAT^rTGT^^GCACCGAGGGTTCGGCGACGGTGCACGGGAAGTGCGGGTCGCTCACGCT 
" ACAGC^^G^CACGG^GGGCTGGGTGGCGGCCGACGACGGCCCGATCCGGCTGACCGCCGGC 

CAACCCGCCAAGCTGTTCAGGGCGACCGTCGGGTTGTGA 
>Rv3264c rmlA2 glucose-1-phosphate thymidyltransferase TB.seq 3644897:3645973 MW:37840
>emb|AL123456|MTBH37RV:c3645973-3644894. rmlA2 SEQ ID NO:123

^^GC^CTCACCAAGTCGATGCGGTGGTC^GGTCGGTGGCAAGGGTACCCGACTGCGQCCG 
s ^GACGCTGTCGGCGCCCAAGCCAATGCTGCCTACCGCCGGACTGCCGTTCCTCACCCATCTG 
CTGTCGCGGATCGCCGCAGCGGGCATCGAGCACGTGATCCTGGGTACGTCCTACAAACCCGCA 
GTCTTCGAAGCGGAGTTOGGCGACGGGTCCGCACTGGGCCTAGAGATCGAATACGTGACCGAG 
GAGCATCCCTTGGGGACTGGCGGCGGCATCGCCAAGGTTGCCGGCAAGCTGCGCAACGACAC 
CGCGATGGTGTTTAACGGCGATGTGCTCTCGGGCGCGGATCTGGCCCAACTGCTGGACTTCCA 
CCGAAGCAATCGAGCCGATGTCACGCTGCAACTGGTGCGGGTGGGCGACCCGCGGGCATTCG
10 GCTGCGTA^CAOCGAGGAGGAGGAGCGCGTAGTCGCCTTTCTGG = 

CCGACCGACCAGATCAATGCCGGCTGCTATGTCTTCGAACGCAACGTCATCG/^CGGMTCCGC 
AGGGCCGGGAGGTTTCGGTGGAACGCGAGGTGTTCCCGGCCTTGCTCGCCGACGGCGACTGC 
AAGATCTACGGCTATGTCGATGCCAGCTATTGGCGGGACATGGGCACACCGGAAGACTTCGTTC 
i5 G^GGATCGGCGGATCTGGTGCGCGGCATCGCCCCGTCTCCGGCCTTGCGTGGTCACCGCG^T 
GAGC^GTTGGTGCACGACGGTGCGGCGGTATCTCCCGGTGCGTTGCTGATTGGCGGCACCGTC 
GTGGGGCGTGGTGCCGAAATCGGCCCCGGCACCAGATTGGACGGCGCGGTC^TCTTCGATGG 

totg^gVggaggccgggtgcgtgatcgagcgttcgatcatcggcttcggtgctcgcatcgg 

ACCG^GGG^G^GATCCGCGACGGTGTGATCGGTGACGGGGCCGACATCGGCGCGCGCTGCG 
AGTTGTTAAGTGGTGCCCGGGTATGGCCCGGTGTCTTTCTTCCCGACGGCGGGATCCGTTACTC

GTCCGACGTTTGA 

>Rv3368c - TB.seq 3780334:3780975 MW:23734 >emb|AL123456|MTBH37RV:c3780975-3780331.

26 ATG^CCCTCAACCT^TC^GTCGACGAGGTCCTGACCACTACCCGCTCGGTGCGCAAGCGTCTC^ 
G^JT^CGACAAGCCGGTGCCACGCGACGTGCTGATGGAATGCCTCGAGCTGGCGCTGCAGGCG 
CCCACCGGTTCCAATTCCCAAGGCTGGCAGTGGGTGTTCGTCGAGGACGCCGCCAAGAAAAAG 
G^MGGCCG^CGTCTACCTGGCCAACGCCCGGGGCTACCTCAGCGGGCCGGCG^CCG^^T^ 
CCCCGACGGCGACACCCGCGGCGAGCGGATGGGGCGGGTCCGCGATTCGGCGACCTATCTCG 

CCGAACAGATGCACCGGGCGCCGGTGCTGCTGATCCCCTGCCTGAAAGGCCGGGAAGACGAG
^^G^GTGGCGTGTCGTTTTGGGCCTCAOTGTTCCCGGCGGTGTGGAGGTTCT^CTG 
GCGCTGCGCTCCCGCGGGCTGGGTTCGTGCTGGACGACGCTGCACCTGCTCGACAfl^GGCGA 
GCACAAGGTGGCCGACGTGCTCGGCATTCCCTACGACGAATACAGCCAAGGCGGGCTGCTTCC 
GATCGCCTACACACAAGGCATCGACTTCCGGCCGGCCAAGCGGCTGCCGGCCGAGAGCGTGA 

CGCACTGGAACGGCTGGTAA
>Rv3382c lytB1 TB.seq 3796447:3797433 MW:34667 >emb|AL123456|MTBH37RV:c3797433-3796444. lytB SEQ ID NO:125

ATGGCTGAGGTGTTCGTGGGACCGGTCGCACAGGGATACGCTTCGGGTGAAGTCACGGTGCTG 

TTGGCGTCGCCGCGGTCGTTTTGCGCCGGTGTAGAGCGTGCTATCGAGACGGTCAAGCGAGTG 

CTTGACGTGGCCGAAGGCCCGGTGTATGTGCGCAAGCAAATCGTGCACAACACTGTTGTGGTT 

GCCGAGTTGCGGGACCGGGGAGCAGTGTTCGTCGAGGATCTCGACGAGATTCCCGATCCGCC 

GCCGCCGGGGGCGGTCGTGGTGTTCTCCGCGCATGGGGTTTCCCCGGCGGTGCGCGCGGGC 

GCTGATGAGCGGGGACTGCAGGTCGTCGACGCGACCTGCCCACTGGTGGCGAAAGTCCACGC 

TGAAGCCGCACGGTTTGCCGCGCGCGGTGACACGGTGGTCTTCATCGGGCACGCCGGACATG 

AGGAGACCGAAGGCACGCTTGGCGTCGCTCCGCGGTCAACATTATTGGTGCAGACACCCGCTG 

ATGTGGCAGCGTTGAACCTGCCCGAGGGTACCCAGCTATCGTATCTGACCCAGACAACCCTGG 

CACTTGATGAAACTGCCGATGTCATTGATGCGCTGCGCGCGAGGTTTCCGACGTTGGGCCAACC 

CCCCTCTGAAGACATCTGCTATGCCACCACGAACAGACAGCGTGCGCTGCAATCGATGGTCGGT 

GAATGTGACGTTGTGTTGGTGATTGGCTCGTGCAATTCGTCGAATTCGCGGCGTCTGGTCGAGT 

TGGCGCAGCGAAGTGGGACGCCGGCCTACTTGATTGACGGGCCTGATGACATTGAGCCCGAAT 

GGCTGTCGTCGGTCTCGACGATCGGTGTCACCGCGGGAGCCTCCGCGCCGCCACGACTGGTG 

GGGCAGGTGATTGATGCACTTCGCGGATACGCCTCGATCACCGTGGTGGAACGCTCGATAGCG 

ACCGAGACGGTGCGATTCGGCCTTCCCAAACAGGTTCGCGCGCAATGA 

>Rv3418c groES 10 kD chaperone TB.seq 3836985:3837284 MW:10773 
>emb|AL123456|MTBH37RV:c3837284-3836982. groES SEQ ID NO:126

GTGGCGAAGGTGAACATCAAGCCACTCGAGGACAAGATTCTCGTGCAGGCCAACGAGGCCGAG

ACCACGACCGCGTCCGGTCTGGTCATTCCTGACACCGCCAAGGAGAAGCCGCAGGAGGGCAC 

CGTCGTTGCCGTCGGCCCTGGCCGGTGGGACGAGGACGGCGAGAAGCGGATCCCGCTGGACG 

TTGCGGAGGGTGACACCGTCATCTACAGCAAGTACGGCGGCACCGAGATCAAGTACAACGGCG 

AGGAATACCTGATCCTGTCGGCACGCGACGTGCTGGCCGTCGTTTCCAAGTAG 

>Rv3423c alr TB.seq 3840193:3841416 MW:43357

>emb|AL123456|MTBH37RV:c3841416-3840190. alr SEQ ID NO:127
GTGAAACGGTTCTGGGAGAATGTCGGAAAGCCAAACGACACGACAGATGGGCGGGGCACGACT 

TCGTTGGCCATGACACCGATATCCCAGACACCTGGCCTCCTCGCCGAGGCCATGGTGGATCTG 

GGCGCTATTGAACACAACGTGCGGGTGCTGCGTGAGCACGCCGGCCACGCGCAGCTGATGGC 

GGTGGTCAAGGCCGACGGCTACGGTCACGGTGCTACGCGCGTCGCCCAAACCGCCCTGGGAG 

CCGGTGCGGCCGAACTCGGCGTCGCCACCGTCGACGAGGCGCTAGCGCTGCGCGCTGATGGC 

ATTACCGCACCGGTGCTGGCCTGGCTGCATCCGCCCGGCATCGACTTCGGGCCCGCGCTGCTG 

GCCGACGTGCAGGTCGCGGTGTCCTCGCTGCGCCAACTCGACGAACTGTTGCACGCGGTGCG 

CCGGACCGGCCGGACGGCGACGGTGACCGTCAAGGTGGATACCGGGCTGAACCGCAATGGCG 
TGGGACCGGCACAATTCCCGGCCATGCTGACCGCGTTACGCCAAGCCATGGCCGAGGACGCC
GTCCGGCTGCGGGGGCTGATGTCGCATATGGTTTACGCCGACAAGCCTGACGATTCCATCAAC 
GATGTTCAGGCCCAACGGTTTACCGCCTTTCTGGCGCAGGCCCGCGAACAAGGGGTGCGGTTC 
GAGGTGGCGCATCTATCGAACTCATCAGCAACTATGGCGCGCCCCGACCTGACGTTCGACCTG 

GTGCGGCCGGGCATCGCGGTGTATGGGCTAAGCCCGGTACCCGCCCTCGGTGACATGGGGCT
GGTGCCGGCGATGACCGTGAAATGTGCTGTTGCGCTGGTGAAATCGATTCGTGCGGGGGAGGG 
CGTGTCGTATGGGCACACATGGATCGCGCCACGCGACACCAATCTGGCGCTGCTGCCGATCGG 
TTACGCAGACGGCGTGTTCCGGTCGCTGGGCGGGCGGCTGGAGGTGGTGATCAACGGCAGAC 
GATGCCCCGGTGTGGGGCGGATCTGCATGGACCAGTTCATGGTCGACCTGGGCCCCGGGCCG 

CTTGATGTGGCCGAAGGCGACGAGGCGATTTTGTTCGGGCCGGGCATCCGGGGTGAGCCCAC
GGCTCAGGACTGGGCCGATCTTGTCGGCACCATCCACTACGAAGTGGTCACCAGCCCGCGAGG 
ACGTATCACCAGGACCTATCGCGAGGCTGAAAACCGTTGA 

>Rv3490 otsA [alpha],-trehalose-phosphate synthase TB.seq 3908232:3909731 MW:55864
>emb|AL123456|MTBH37RV:3908232-3909734. otsA SEQ ID NO:128

ATGGCTCCCTCGGGAGGCCAGGAGGCGCAGATTTGCGATTCGGAGACCTTCGGGGACTCTGAC 
TTCGTGGTGGTAGCCAATCGACTGCCCGTCGATCTGGAGCGTCTTCCCGACGGCAGCACAACC 
TGGAAACGCAGCCCCGGAGGCTTGGTCACCGCCTTGGAGCCGGTGCTGCGGCGTCGGCGCGG 
GGCCTGGGTCGGCTGGCCCGGCGTTAACGACGACGGGGCCGAACCCGACCTCCACGTGCTGG 
ACGGCCCCATCATCCAAGACGAGCTGGAACTTCATCCGGTACGGCTGAGCACCACGGACATAG
CTCAGTACTACGAGGGATTCTCCAACGCCACACTGTGGCCGCTGTACCACGACGTCATCGTCAA 
GCCGCTCTACCACCGCGAATGGTGGGATCGCTACGTCGACGTCAACCAGCGCTTTGCCGAGGC 
CGCGTCGCGCGCCGCCGCCCACGGCGCAACCGTGTGGGTACAGGACTACCAGCTGCAGCTGG 
TACCGAAGATGCTGCGCATGCTGCGGCCCGATCTGACCATCGGTTTCTTTTTGCACATCCCGTT 
CCCGCCGGTAGAGCTGTTTATGCAGATGCCGTGGCGCACCGAGATCATCCAGGGCCTACTGGG
CGCCGACCTGGTGGGCTTCCATCTTCCGGGCGGTGCCCAGAATTTCCTGATCCTGTCCCGGCG 
TCTGGTCGGCACCGACACTTCCCGCGGAACCGTCGGTGTGCGGTCGCGGTTCGGTGCGGCGG 
TGCTCGGGTCCCGCACCATACGAGTTGGCGCCTTTCCTATCTCGGTTGACTCCGGCGCGCTCG 
ACCACGCTGCCCGCGACCGCAACATCAGGCGCCGGGCCCGCGAGATTCGCACCGAACTGGGA 
AATCCGCGCAAGATCCTGCTCGGTGTTGACCGGCTCGACTACACCAAGGGCATCGACGTACGG
CTGAAGGCCTTTTCCGAGCTGCTGGCCGAGGGCCGCGTCAAACGCGACGACACCGTCGTGGTC 
CAGCTGGCTACCCCGAGCCGCGAGCGGGTGGAGAGCTACCAGACGCTGCGCAACGACATCGA 
ACGCCAGGTCGGCCACATTAACGGCGAGTACGGTGAGGTTGGCCATCCGGTAGTGCATTACCT 
GCATCGACCGGCTCCGCGCGACGAGCTTATCGCTTTCTTCGTGGCCAGCGACGTCATGCTGGT 
CACCCCACTACGCGACGGGATGAACCTGGTGGCCAAGGAGTACGTCGCTTGCCGCAGCGATCT
TGGCGGTGCCCTGGTGCTCAGCGAATTCACCGGGGCCGCAGCCGAACTCCGGCACGCATACCT 
GGTCAACCCGCACGACCTGGAAGGCGTCAAGGACGGGATAGAGGAAGCGCTCAACCAGACGG 
AGGAGGCGGGCCGGCGGCGAATGCGGTCGC
CCGCTGGGCACAGTCGTTTCTCGACGCTCT 



TGCGACGCCAAGTGCTCGCCCACGACGTGGA 
CGCCGGGGCACACCCGAGGGGCCAAGGCTAA 



GCAGTTCCGGATTCGCCGGGACAAGCGC 



>Rv3598c lysS lysyl-tRNA synthase TB.seq 4041423:4042937 MW:55678
>emb|AL123456|MTBH37RV:c4042937-4041420. lysS SEQ ID NO:129
GTGAGTGCCGCTGACACAGCAGAAGACCTTCCTGAGCAGTTCCGGA 
QCTCGCTTGCTGGCCCAGGGGCGCGATCCCTATCCCGTCGCGGTGCCGCGCACTCACAGSTTG 

G^C^AGG^CGCGCCGCCCACCCTGAGTTGCCGATCGATACCGCGACCGAAGACATCGTCGGC 

GTCGCGGGCCGAGTGATCTTTGCGCGCAACTCGGGAAAGCTATGCTTTGCGACACTTCAGGAC 

GG^GATGGTA^^GCTGCAAGTGATGATCAGCCTCGACAAGGTCGGCCAGGCTGCTCTCGAC 

^t^gSgatgtcgacctgg^^ 

CGC^GC^G^^GCTGTCCGTCCTGGCGGATTGCTGGCGGATCGCCGCCAAGTCGCTGCGGCC 

o^ccStcgcgcac^ 
catIgtVcgaccggaagcgcgcg^^^ 

gg/^^gcgc^caacgtcgtgggttcctggaagtcgagacgcccgtcttgcagacgttagccg 
gtgg^^^tccgttcgccactca^ccaatgcgctagacatcgatct^a^ 
^cgccggaactgwcctcaagcggtgcatcgtgggtggtttc^acaaggtcttcgaactt 

^ftjCGAG^G^^CGAAACGAAGGAGCCGATTCCACGCATTCTCCGGAATTCTCCATG^TGGA^V 

c^a^agacctacggaacctatgacgattcggcagtcgtcacccgggagcttattcaagaggt 

SLc^GGCGATCGGAACCAGACAAOTGCCGTTGCCCGACGGCAGTGTCTATGACATCGA 

ACCGCAGACGACGGTCGATCGCTTACGTGGGATCGC^GATAGCCTTGGCCTGGAGAAAGACCC 
AG^A^CAK3ACAACCGTGGCrTCGGCCACGGCAAACTCATCGAGGAACTCTGGGAGCGCAC 
^^«QQ^^^^^^rre^CGCACCCACATTTGTCAAGGATTTTCCGGTTCAGACAACGCCTTTO 

GAAC^GC^^GGCTACTCGGAATTAAGCGACCCGGTAGTCCAGCGGGAGAGATTCGCCGAC 
C^GCCC^TGCCGCGGCCGCTGGCGATGACGAAGCGATGGTGCTTGACGAGGATTTTCTGGCC 
GCTCTGGAGTACGGCATGCCACCGTGCACCGGAACCGGAATGGGTATCGATCGGTTGTTGATG 
TC"TTn^^^rGGGTTGTCAATTAGGGAGACAGTTTTGTTCCCGATTGTTCGACCACACTCCAACTG 

A 

>Rv3600c - similar to Bacillus subtilis protein YacB TB.seq 4043041:4043856 MW:29274
>emb|AL123456|MTBH37RV:c4043856-4043038. Rv3600c SEQ ID NO:130
(STOCT^CTGGC^ATTGACGTCCGCAACACCCACACCGTTGTGGGCCTGCT 

Lgcacgcaaaggtcgtgcagcagtggcggatacgcaccgaatccgaagtcaccgccgacgaa 

CTGGCACTGACGATCGATOGGCTGATCGGCGAGGATTCCGAGCGGCTCACCGGTACCGCCGC 
CTTGT^CGGTCCCGTCCGTGCTGGACGAGGTGCGGATAATGCTOGACCAGTACT^^ 
GGTGCCGCACGTGCTGATCGAGCCCGGAGTACGCACCGGGATCCCTTTGCTCGTCGACAACCC
GAAGGAAGTGGGCGCAGACCGCATCGTGAACTGTTTGGCCGCCTATGACCGGTTCCGGAAGGC 
CGCCATCGTCGTTGACTTTGGATCCTCGATCTGTGTTGATGTTGTATCGGCCAAGGGTGAATTTC 
TTGGCGGCGCCATCGCGCCCGGGGTGCAGGTGTCTTCCGATGCCGCGGCGGCCCGCTCGGCG 
GCATTGCGCCGCGTTGAACTTGCCCGCCCACGTTCGGTGGTTGGCAAGAACACCGTCGAATGC
ATGCAAGCCGGTGCGGTGTTCGGCTTCGCCGGGCTGGTAGACGGGTTGGTAGGCCGCATCCG 
CGAGGACGTGTCCGGTTTCTCCGTCGACCACGATGTCGCGATCGTGGCTACCGGGCATACCGC 
GCCCCTGCTGCTGCCGGAATTGCACACCGTCGACCATTACGACCAGCACCTGACCTTGCAGGG 
TCTGCGGCTGGTGTTCGAGCGTAACCTCGAAGTCCAGCGCGGCCGGCTCAAGACGGCGCGCT 

GA

>Rv3606c folK 7,8-dihydro-6-hydroxymethylpterin pyrophosphokinase TB.seq
4048181:4048744 MW:20732 >emb|AL123456|MTBH37RV:c4048744-4048178. folK 
SEQ ID NO:131 




ATGACGCGGGTAGTGCTCTCGGTTGGCTCCAACCTGGGTGACCGCCTGGCACGATTGCGGTCG 
GTCGCCGACGGTCTCGGCGATGCGTTGATTGCGGCTTCCCCGATATATGAGGCCGACCCCTGG 
GGTGGGGTGGAGCAGGGGCAGTTCCTCAATGCGGTGCTGATCGCCGACGATCCTACCTGCGAA 
CCGCGGGAGTGGCTGCGGCGGGCGCAGGAGTTCGAGCGCGCTGCGGGCAGGGTGCGTGGCC 
AGCGCTGGGGTCCACGAAATCTCGACGTCGACCTGATCX3CCTGCTACCAGACCTCGGCCACCG 
AGGCTCTGGTCGAAGTGACCGCGCGGGAGAACCACCTCACGCTGCCGCACCCACTGGCGCAT 
CTGCGGGCCTTTGTGTTGATCCCGTGGATTGCCGTCGACCCAACGGCGCAGCTGACGGTTGCC 
GGGTGCCCGCGGCCCGTCACGCGACTGCTGGCCGAGCTGGAGCCCGCCGACCGCGACAGTGT 
GCGGTTGTTTAGGCCGTCGTTCGATCTGAATAGCAGACACCCCGTCAGTCGGGCACCGGAAAG 

CTGA 

>Rv3607c folX may be involved in folate biosynthesis TB.seq 4048744:4049142 MW:14553
>emb|AL123456|MTBH37RV:c4049142-4048741. folX SEQ ID NO:132

ATGGCTGACCGAATCGAACTGCGCGGCCTGACCGTGCATGGTCGGCACGGGGTCTACGACCAC 

GAGCGAGTGGCCGGGCAGCGGTTTGTCATCGATGTCACCGTGTGGATAGACCTGGCCGAGGC 

CGCCAACAGCGACGACTTGGCCGACACCTATGACTACGTGCGGCTGGCTTCGCGGGCGGCCG 

AGATCGTCGCCGGACCCCCGCGGAAGCTGATCGAAACGGTCGGGGCCGAGATCGCTGATCAC 

GTGATGGACGACCAGCGAGTGCATGCCGTTGAGGTGGCGGTACACAAGCCGCAGGCGCCCATT 

CCGCAGACGTTCGACGATGTGGCGGTGGTGATCCGACGCTCACGGCGCGGCGGCCGCGGTTG 

GGTAGTCCCGGCGGGCGGCGCGGTATGA 

>Rv3608c folP dihydropteroate synthase TB.seq 4049138:4049977 MW:28812 
>emb|AL123456|MTBH37RV:c4049977-4049135. folP SEQ ID NO:133 



CATCGATACCATGCGCGCGGATGTCGCTCGGGO 



CGACGTGTCGGGTGGGCGGGCCGATCCGGCG 
CCGTGGGTGTTGATGCACTGGCGGGCGGTATCG 1 



ATGGGGCCGCTGTTGGCCGAGGCCGATGTG 
GCCGATACCCCGCATGTGCCTGTGCGCTAC 



GGGAGCGGAAAGGATAGAACGCGATGGCTGA 
>Rv3609c folE GTP cyclohydrolase I TB.seq 4049977:4050582 MW:22395

M QTTCG^AG^^^^^^^GTCACCACTACGTCGGCGGTGCGCGGACTGTTCAAAACCAATGCC 
GCTTCTCG AGCCGAAGCGCTCGACCTCATTTTGCGGAAGTGA 

>Rv3610c ftsH inner membrane protein, chaperone TB.seq 4050601:4052880 MW:81987

>emb|AL123456|MTBH37RV:c4052880-4050598. ftsH SEQ ID NO:135

t^gttcttttC!ctt^^cgacgacacccgcggctacaagcccgttgatacctcggtggcgataa 

^^A^^TCAA^SGCGACAACGTCAAGAGCGCACAGATCGACGATCGCGAGCAACAGCTGCG^^ 

to^^gaagggtaac^^^ 

^rn^^^^CC^TCGACCTGTTCAACGCGCTCAGCGCCAAAAACGCGAAGGTCAGCACGGTCG 
35 ^GGGTACGC^TCGACCTGTT GTCTACGTGC TGCCGCTGCTGTTGCTGGTGG 
CACGCGCCAAGCAACTGAGCAAGGACATGCCCAAGACCACCTTCGCCGACGTCGCAGGTGTCG

A^GAGGCGGTCGAGGAGCTCTACGAGATCAAGGACTTCCTGCAGAACCCCAGCAGGTACCAAG 

C^CTGGGCGCCAAGATCCCCAAAGGCGTGCTGCTCTACGGGCCGCCGGGAACCGGTAAGACG 

^GCTG^TCGTGCGGTGGCCGGCGAAGCCGGAGTGCCGTTCTTCACCATCTCCGGCTCCGAC 

^CGTCGAAATGTTCGTCGGCGTCGGCGCATCCCGTGTCAGAGACCTGTTCGAGCAGGCCAAG 

CA^A^^G^^GTGCATCATCTTCGTCGACGAGATCGACGCCGTCGGCCGACAAAGAGGCGCC 

G^GCTOGGCGGCGGTCACGACGAGCGTGAGCAGACCCTCAACCAGTTGCTAGTCGAAATGGA 

cgg™^atcgcgccggcgtcatcctgatgggggccaccaaccggcccgaoatcctgga 

CCCGGCG^raTTGCGGCCGGGCCGCTTCGACCGCCAGATCCCGGTATCCAACCCCGATCTGG 

^ggLggcggtgctgcgcgtgcactccaagggcaagccgatggccgcggac^c^ 

CC^^ACGGACTGGCCAAGCGGACCGTCGGCATGACCGGAGCCGACCTGGCCAACGTCATCA 

aSaggTggcgctgct^^ 
o^Sgtggaccgggtgatgggcggcccgcgcc^ 

^gaagatcaccgoctatoacgagggcgggcacacgctggccgcttgggcgatgccc^ 
^g^atttataaggtgacgatcctggcgcgcgggcgtaccggcgggcacgcggtggcgg 

TGCOGGA^GWvGACAAGGGCCTGCGGAGCCGCTCGGAAATGATCGCGCAACTGGTGTTCGCGA 

Itcgagcaggccaccaagatagcgcgctcaatggtcaccgaatwggaatgagctcgaagctg 

^™TCAAATACGGCTCCGAACACGGCGACCCGTT(XTCGGACGTAa^TGGGCACCCAG 
^^q^^^CWJACGAGGTGGCCCGCGAGATCGACGAAQAGGTCCGCAAGCTTATC^^^O 
^ATACCGAAGCGTGGGAAATCCTGACCGAATACCGCGACGTGCTGGACACTTTGGCCGGC 
GAGCTG^T^GAAAAGGAGACCCTGCACCGACCCGAGCTGGAAAGCATCTrCGCTGACGTCGAA 

«Sggcc^ggctcaccatgttcgacgacitcgg^^ 

ATGAW^CACCCGGCGAGCTCGCGATCGAACGCGGCGAACCTTGGCCCCAGCCGGTCCCCGA 
G^CG^CGTTCAAGGCGGCGATTGCGCAGGCTACCCAAGCCGCTGAGGCCGCCCGGTCCGACG 

"g^CCGGGCACGGCGCCAACGGTTCGGCCGC^ 

CAGTACGGCT^ACCCAGCCTGACTACGGTGCCCCGGCGGGCTGGCATGCGCCGGGATGG^ 
CCCAAGGTCATCTCATCGGCCCAGCTATAGCGGTGAACCGGCACCGACGTATCCGGGTCAGCC 
C^CC^ACCG^TCAAGCCGATCCGGGTTCCGATGAGTCCTCGGCGGAGCAGGATGACGAGGT 

CAGTCGGACCAAGCCGGCCCACGGCTGA 

>Rv3671c - TB.seq 4112322:4113512 MW:40722 >emb|AL123456|MTBH37RV:c4113512-4112319.
ATG^CCCCGTCG^ 

GCTGGCGTGCCGGTGCGCTGGGCTCAATGCTGTCGTTTGGCGGGGTGCTGCTGGGCGCGACA 
GCCGGCGTGCTGCTGGCGCCGCATATCGTCAGTCAAATCAGCGCTCCGCGGGCCAAACTGTTT 
GCCGCGCTGTTCCTGATCCTGGCACTGGTCGTAGTCGGCGAGGTCGCTGGTGTGGTGCTGGGC 
CGCGCCGTCCGCGGGGCGATCCGTAACCGGCCGATCCGGTTGATCGACTCGGTCATTGGGGTA

GGGGTGCAGCTGGTCGTGGTGCTCACCGCGGCGTGGTTGTTGGCGATGCCGCTGACACAGTC 

GAAAGAGCAGCCCGAGCTGGCTGCCGCGGTGAAGGGTTCGCGGGTGCTCGCCCGGGTCAACG 

AGGCGGCACCCACCTGGCTGAAGACGGTGCCCAAGCGGCTGTCGGCCCTGCTGAACACCTCC 

GGCCTGCCCGCGGTTTTGGAGCCGTTCAGCCGCACGCCGGTCATTCCAGTGGCCTCACCCGAC 

CCAGCGCTGGTCAACAATCCGGTGGTGGCGGCCACCGAGCCAAGTGTCGTCAAAATCCGCAGC 

CTGGCACCCAGATGCCAGAAAGTGTTGGAGGGCACCGGCTTCGTGATCTCACCCGATCGGGTG 

ATGACCAACGCGCACGTGGTGGCCGGATCCAACAACGTCACGGTGTATGCCGGCGACAAGCCC 

TTCGAGGCCACGGTGGTGTCCTACGACCCGTCGGTCGACGTAGCGATCCTGGCCGTTCCGCAC 

TTGCCGCCGCCGCCGCTGGTCTTCGCTGCGGAGCCGGCGAAAACCGGTGCCGACGTTGTGGT 

GCTGGGTTATCCCGGCGGCGGCAATTTCACTGCCACACCCGCCAGGATTCGCGAGGCCATCAG 

ACTCAGTGGCCCCGATATTTACGGGGACCCGGAGCCGGTTACCCGCGACGTGTACACCATCAG 

AGCCGATGTGGAGCAAGGTGATTCGGGTGGGCCCCTGATCGACCTCAACGGTCAGGTGCTCGG 

TGTGGTGTTCGGCGCAGCCATCGACGACGCCGAAACTGGGTTTGTGCTGACGGCCGGCGAGGT 

GGCGGGGCAGCTTGCCAAAATCGGTGCTACCCAACCGGTCGGCACCGGGGCCTGCGTCAGCT 

GA 

>Rv3682 ponA2 TB.seq 4121913:4124342 MW:84637
>emb|AL123456|MTBH37RV:4121913-4124345. ponA' SEQ ID NO:137 

ATGCCCGAGCGCCTCCCGGCCGCGATCACCGTTCTGAAGCTGGCTGGGTGCTGTCTGTTGGCC 

AGTGTCGTCGCCACTGCGCTGACGTTCCCGTTCGCAGGCGGGCTAGGGCTGATGTCCAATCGT 

GCCTCTGAGGTCGTTGCCAACGGCTCGGCCCAGCTGCTCGAGGGGCAAGTGCCTGCGGTATCG 

ACGATGGTCGACGCGAAGGGCAACACGATCGCGTGGCTGTACTCGCAGCGCCGGTTCGAGGT 

GCCCTCGGACAAGATCGCCAACACGATGAAGCTGGCGATCGTCTCGATTGAAGATAAGCGGTTC 

GCCGACCACAGCGGCGTGGACTGGAAGGGCACCCTGACCGGCCTGGCGGGCTACGCGTCCG 

GCGACCTCGACACGCGCGGCGGCTCGACGCTCGAACAACAGTACGTGAAGAACTACCAACTGC 

TGGTGACAGCCCAAACCGATGCCGAGAAGCGAGCGGCCGTCGAAACCACTCCGGCCCGCAAG 

CTTCGCGAGATCCGGATGGCACTCACGCTGGACAAGACCTTCACAAAATCTGAAATCCTGACCC 

GATACTTGAACCTGGTCTCGTTCGGCAATAACTCGTTCGGCGTGCAGGACGCGGCGCAAACGTA 

CTTCGGCATCAACGCGTCCGACCTGAATTGGCAGCAAGCGGCGCTGCTGGCCGGCATGGTGCA 

ATCGACCAGCACGCTCAACCCGTACACCAACCCCGACGGCGCGCTGGCCCGGCGGAACGTGG 

TCCTCGACACCATGATCGAGAACCTTCCCGGGGAGGCGGAGGCGTTGCGTGCCGCCAAGGCC 

GAGCCGCTGGGGGTACTGCCGCAGCCCAATGAGTTGCCGCGCGGCTGCATCGCGGCCGGCGA 

CCGCGCATTCTTCTGCGACTACGTCCAGGAGTACCTGTCTCGGGCCGGGATCAGCAAGGAGCA 

GGTCGCCACGGGCGGGTACCTGATCCGCACCACCCTGGACCCAGAGGTGCAGGCACCGGTCA 

AGGCCGCCATCGACAAGTACGCCAGCCCGAACCTGGCCGGTATTTCCAGCGTGATGAGCGTGA 

TCAAACCGGGTAAGGATGCGCACAAGGTGTTGGCCATGGCCAGTAACCGCAAATACGGGCTGG 


ATCTAGAAGCCGGCGAAACCATGCGGCCGCAGCCATTCTCCCTGGTTGGCGACGGCGCCGGGT 

CTATCTTCAAGATCTTCACCACGGCCGCTGCTCTGGACATGGGCATGGGTATTAACGCCCAACT 

CGACGTGCCGCCCCGATTCCAGGCCAAAGGTCTGGGAAGTGGCGGGGCAAAGGGGTGCCCCA 

AAGAGACCTGGTGTGTGGTGAACGCCGGCAACTACCGCGGCTCGATGAATGTCACCGACGCGC 

TGGCAACCTCGCCAAACACCGCGTTCGCCAAGCTGATCTCGCAGGTCGGGGTGGGGCGTGCG 

GTCGATATGGCCATCAAACTCGGGCTGAGGTCTTATGCGAATCCCGGCACCGCACGCGACTAC 

AACCCCGACAGCAATGAGAGCTTGGCTGACTTCGTCAAACGACAGAACCTGGGTTCGTTCACCC 

TCGGCCCCATCGAGTTAAACGCGCTGGAGCTGTCCAACGTGGCGGCCACGTTGGCATCCGGCG 

GCGTGTGGTGCCCCCCCAACCCAATCGACCAGCTCATCGACCGCAACGGCAACGAAGTCGCGG 

TCACCACCGAGACGTGCGACCAGGTGGTGCCCGCAGGGCTGGCGAACACCCTCGCCAACGCG 

ATGAGCAAGGACGCCGTGGGCAGCGGCACGGCGGCCGGTTCGGCCGGCGCGGCGGGCTGGG 

ATCTGCCGATGTCCGGCAAAACCGGCACCACCGAGGCGCACCGGTCGGCCGGCTTCGTGGGC 

TTCACCAACCGCTACGCGGCGGCGAACTACATCTACGACGACTCCAGCTCGCCGACAGATCTGT 

GTTCCGGCCCGCTGCGCCATTGCGGCAGCGGCGACTTGTACGGCGGCAACGAGCCATCCCGC 

ACCTGGTTCGCCGCGATGAAGCCGATCGCCAACAACTTCGGCGAAGTGCAGCTACCACCGACC 

GATCCACGCTATGTCGACGGCGCACCAGGCTCACGGGTACCAAGCGTGGCCGGTCTGGATGTC 

GACGCCGCACGCCAGCGCCTCAAGGACGCGGGCTTCCAGGTCGCCGACCAAACCAACTCGGT 

CAACAGCTCCGCCAAGTATGGTGAGGTGGTCGGAACGTCGCCCAGCGGTCAAACAATTCCGGG 

TTCGATCGTCACGATCCAGATCAGCAACGGCATCCCGCCGGCTCCGCCTCCGCCACCGCTGCC 

TGAGGATGGTGGGCCGCCACCGCCGGTCGGATCGCAGGTGGTGGAGATTCCGGGGCTGCCGC 

CGATCACCATTCCGCTGCTGGCGCCACCACCCCCAGCGCCTCCCCCGTAG 

>Rv3721c dnaZX DNA polymerase III, [gamma] (dnaZ) and t (dnaX) TB.seq 4164995:4166728
MW:61892 >emb|AL123456|MTBH37RV:c4166728-4164992. dnaZX SEQ ID NO:138
GTGGCTCTCTACCGCAAGTACCGACCGGCAAGCTTCGCGGAGGTGGTGGGGCAGGAGCACGT 

CACCGCGCCGCTGTCGGTGGCGCTGGATGCCGGCCGGATCAACCACGCGTACCTGTTCTCTGG 

GCCGCGTGGCTGCGGAAAGACGTCGTCAGCGCGTATCCTGGCGCGGTCGTTGAACTGTGCGCA 

GGGCCCTACCGCCAACCCGTGCGGGGTCTGCGAATCCTGCGTTTCGTTGGCGCCCAACGCCCC 

CGGCAGCATCGACGTGGTAGAGCTGGATGCCGCCAGCCACGGCGGCGTGGACGACACCCGCG 

AGCTGCGGGACCGCGCGTTCTATGCGCCGGTCCAGTCACGGTACCGGGTATTTATCGTCGACG 

AGGCGCACATGGTGACCACCGCGGGATTCAACGCGCTGCTCAAGATCGTGGAGGAACCGCCC 

GAACACCTGATCTTCATATTCGCCACCACCGAACCGGAGAAGGTACTGCCGACGATTCGGTCGC 

GCACTCATCACTACCCGTTCCGGCTGCTGCCGCCGCGCACTATGCGGGCGTTGCTCGCGCGGA 

TCTGCGAGCAGGAGGGCGTCGTCGTCGACGATGCGGTGTACCCGTTGGTGATCCGGGCCGGC 

GGAGGTTCCCCACGGGATACGCTCTCGGTGCTGGACCAATTGCTGGCTGGGGCCGCGGACAC 

CCACGTGACCTACACCCGGGCGCTGGGGCTGCTGGGTGTCACCGACGTCGCCCTGATCGACG 

ACGCGGTCGACGCACTGGCCGCTTGCGATGCGGCCGCATTGTTCGGGGCGATCGAATCGGTGA 
TCGATGGCGGACATGACCCTCGGCGTTTCGCTACCGATCTGCTGGAGCGATTCCGCGACCTGA

TTGTGCTGCAATCGGTTCCCGACGCGGCATCTCGCGGGGTGGTGGATGCGCCCGAAGACGCG 

CTGGATCGGATGCGCGAGCAAGCCGCCCGGATCGGGCGGGCGACCCTGACCCGATATGCCGA 

GGTGGTGCAGGCCGGGCTAGGCGAGATGCGCGGTGCGACCGCGCCGCGTCTGCTGCTGGAA 

GTGGTTTGCGCGCGACTGCTGCTGCCCTCGGCGAGCGACGCCGAATCGGCACTGTTGCAGCG 

GGTCGAACGGATCGAGACCCGGTTGGACATGTCGATCCCGGCGCCGCAAGCCGTACCACGCC 

CGTCGGCTGCGGCTGCCGAGCCGAAACACCAGCCCGCGCGTGAACCGAGACCGGTGCTGGCC 

CCCACACCGGCCTCGAGCGAACCCACCGTGGCCGCGGTTCGGTCCATGTGGCCGACGGTGCG 

CGACAAGGTGCGCCTGCGCAGCCGTACCACCGAGGTGATGCTGGCGGGTGCCACCGTCCGTG 

CGCTAGAGGACAACACGCTGGTGCTGACCCACGAATCGGCGCCGCTGGCGCGGCGGCTGTCC 

GAACAGCGCAACGCCGATGTCCTCGCCGAGGCGCTTAAAGACGCGCTGGGAGTCAACTGGCG 

GGTGCGGTGTGAGACCGGTGAACCGGCTGCGGCGGCATCACCCGTCGGCGGGGGAGCGAAC 

GTGGCGACCGCCAAGGCCGTAAACCCTGCCCCCACAGCGAATTCCACTCAGCGCGACGAAGAG 

GAGCACATGCTCGCCGAAGCCGGCCGTGGCGACCCGTCGCCGCGTCGCGACCCGGAAGAGGT 

TGCACTCGAGCTGCTGCAGAACGAGCTGGGCGCGCGCCGGATAGACAACGCCTAG 

>Rv3783 - TB.seq 4229255:4230094 MW:32337 

>emb|AL123456|MTBH37RV:4229255-4230097. Rv3783 SEQ ID NO:139

ATGACATTCATGGATGCTCAAGCTAGCTTCCAGACACAGTCGCGGACACTGGCCCGCGTCCGA 

GGCGATCTGGTCGACGGGTTCCGCCGCCACGAGCTGTGGCTGCACCTGGGCTGGCAGGACAT 

CAAGCAGCGGTACCGCCGCTCGGTGCTGGGGCCGTTCTGGATCACCATCGCCACCGGAACGA 

CCGCCGTCGCGATGGGCGGCCTGTATTCCAAGCTGTTTCGGCTCGAGCTGTCTGAGCACCTGC 

CCTACGTCACGCTCGGGCTGATCGTCTGGAACCTGATCAACGCCGCCATCCTGGACGGCGCAG 

AGGTTTTCGTCGCCAACGAAGGTCTGATCAAACAGCTGCCGGCACCGTTGAGCGTGCACGTCTA 

TCGGTTGGTGTGGCGGCAGATGATCTTCTTCGCCCACAACATCGTCATCTACTTCGTCATCGCG 

ATCATCTTTCCTAAGCCGTGGTCGTGGGCGGATCTGTCGTTTCTTCCGGCGCTGGCGCTCATTT 

TCCTCAATTGCGTTTGGGTGTCACTGTGTTTCGGCATCCTGGCGACCCGCTACCGCGACATCGG 

CCCGCTGCTGTTTTCCGTTGTGCAGTTGTTGTTCTTCATGACGCCGATCATCTGGAACGACGAGA 

CCCTGCGTCGGCAGGGCGCGGGCCGCTGGTCGAGCATCGTCGAGCTCAACCCGCTGCTGCAC 

TATCTGGACATCGTGCGGGCGCCACTGTTGGGCGCTCACCAGGAGCTGCGGCACTGGCTGGTG 

GTGCTGGTGTTGACCGTCGTCGGCTGGATGCTGGCGGCGTTCGCGATGCGGCAGTATCGCGC 

GCGGGTGCCCTACTGGGTGTAG 

>Rv3789 - TB.seq 4235371:4235733 MW:13378

>emb|AL123456|MTBH37RV:4235371-4235736. Rv3789 SEQ ID NO:140

ATGCGGTTCGTTGTCACCGGCGGCCTCGCTGGGATAGTTGACTTTGGCCTCTACGTCGTGCTGT 
ACAAGGTGGCGGGCCTACAGGTCGACCTGTCCAAGGCCATCAGCTTCATCGTCGGCACCATCA 
CCGCGTACCTGATCAACCGCCGGTGGACATTO
CGGTCATGCTCCTCTACGGAATCACCTTCGCCG 



CAGGCCGAGCCCAGCACGGCCCGATTCGTCG 
iTGCAGGTCGGACTCAACCACCTCTGCCTCGC 



ACTCTTGCACTACCGGGCGTGGGCCATCCCCGTCGCGTTTGTGATCGCGCAGGGCACCGCCAC 
GGTAATCAACTTCATCGTGCAGCGAGCCGTGATCTTCCGGATCCGCTGA 

>Rv3790 - TB.seq 4235776:4237158 MW:50164 

S^^^TCTGCTTCGCACCCCAGATGCCGA^TGATOGTCAAGGCGGTGGCTCGGGT 
10 CGC^^^^CGGGGGGCGGCCGGGGTGCTATCGCGCGCGGGCTGGGCCGCTCCTATGGGGAC 
AAOG^C^AAAACGGCGGTGGGTTGGTGATCGACATGACGCCGCTGAACACTATCCACTCCATTG 
ACGCCGACACC^^CTGGTCGACATCGACGCCGGGGTCAACCTCGACCAACTGATGAAAGCCG 
^^TGC^^CGGGCTGTGGGTCCCGGTGCTGCCGGGAACCCGGCAGGTCACCGTCGGTOGG 
^S^GCGATATCCACGGCAAGAACCATCACAGCGCTGGCAGCTTCGGTAACCACGTG 

Sag^^gctgaccgccgacggcgagatccgtcatctca CT ccgaccggcgagga 
cat^s^^atgacgcccacttcgacggcgtacttcatcgccgacggcgacgtcaccgccagcct 

^^^TCGCCCTGCACAGC^CG^AGCGAAGCGCGCTACACCTATTCCAGTGCOTG 
ft^CGACGCGATCAGCGCTCCCCCGAAGCTGGGCCGCGCGGCGGTATCGCGTGGCCGCCTGG 
Q " C ^^^GCGAAACTGCGGAGCGAACC^TGAAATTCGATGCGCCACAGCT 

ACTTACGT, v , , TCGCGGC AAGGTCCAGAACCTCACGCAGTrCTACC 



AGTTCGTGATCCCCAI 



CAGAGGCGGTTGATGAGTTCAAGAAGATCATCGGCGTTATTCAAGCCTC 
rTTTCTCAACGTGTTCAAGCTGTTCGGCCCCCGCAACCAGGCGCCGCTCAGC 

GACTTCCCCATCAAGGACGGGCTGGGGAAGTTC 



GGGTCACTACTCG

CGTACCACCGCCGAAACCTTTCATGCCATGTATCCGCGCGTCGACGAATGGATCTCCGTGCGCC 
QCAAGGTCGATCCGCTGCGCGTATTCGCCTCCGACATGGCCCGACGCTTGGAGCTGCTGTAG 

>Rv3791 - TB.seq 4237162:4237923 MW:27470

>emb|AL123456|MTBH37RV:4237162-4237926. Rv3791 SEQ ID NO:142
ATG^TC^GATOCTOT^GGAAACCCCCAGACGGTGCTGCTGCTCGGTGGCACCTCCGAGATC 

G^GCrcGCCATCTOCGAGCGCTAGCTGCACAATTCGGCGGCCCGCATCGTGCTGGCCTGCCTG 

» CCCGACGACCCA^GGCGGGAGGACGCGGCCGCTGCGATGAAGCAGGCCGGCGCGCGGTCGG 

TGGAGCTGAT^GACTTTGACGCCCTGGATACCGACAGCCACCCGAAGATGATCGAGGCGGCCT 

TCTC^GGCGGTGATGTGGACGTGGCTATCGTCGCGTTCGGCTTGCTCGGCGACGCCGAAGAGC 
TGTGGCAGAACCAGCGCAAGGCGGTGCAGATCGCCGAAATCAACTACACCGCAGCGGTTTCGG

TGGGCGTGCTGCTGGCTGAGAAGATGCGCGCTCAGGGCTTCGGTCAGATCATCGCGATGAGCT 

CGGCCGCCGGTGAGCGGGTGCGACGGGCGAACTTCGTCTACGGCTCCACCAAGGCCGGTCTG 

GACGGGTTTTACCTGGGGTTGTCAGAAGCGCTGCGCGAGTACGGTGTTCGTGTGCTGGTGATC 

CGGCCCGGCCAGGTGCGTACCCGGATGAGCGCGCACCTCAAGGAAGCTCCATTGACCGTCGA 

CAAGGAGTACGTCGCCAACCTCGCGGTGACCGCGTCCGCAAAAGGTAAGGAATTGGTTTGGGC 

GCCAGCAGCGTTCCGCTACGTCATGATGGTGTTGCGTCACATCCCGCGGAGCATCTTCCGCAA 

GCTGCCCATCTGA 

>Rv3794 embA TB.seq 4243230:4246511 MW:115694
>emb|AL123456|MTBH37RV:4243230-4246514. embA SEQ ID NO:143

GTGCCCCACGACGGTAATGAGCGATCTCACCGGATCGCACGCCTAGCAGCCGTCGTCTCGGGA 

ATCGCGGGTCTGCTGCTGTGCGGCATCGTTCCGCTGCTTCCGGTGAACCAAACCACCGCGACC 

ATCTTCTGGCCGCAGGGCAGCACCGCCGACGGCAACATCACCCAGATCACCGCCCCTCTGGTA 

TCCGGGGCGCCACGCGCGCTGGACATCTCGATCCCCTGCTCGGCCATCGCCACGCTGCCCGC 

CAACGGCGGCCTGGTGCTGTCCACACTGCCGGCCGGTGGCGTGGATACCGGTAAGGCCGGGC 

TGTTCGTCCGCGCCAACCAGGACACGGTCGTCGTGGCGTTCCGCGACTCGGTGGCCGCGGTG 

GCGGCCCGCTCCACGATCGCAGCGGGAGGCTGTAGCGCGCTGCATATCTGGGCCGATACCGG 

CGGCGCGGGCGCTGATTTTATGGGTATACCCGGCGGCGCCGGGACCCTGCCGCCGGAGAAGA 

AGCCACAGGTTGGCGGCATCTTCACCGACCTGAAGGTCGGAGCGCAGCCCGGGCTGTCGGCC 

CGCGTCGACATCGACACTCGGTTTATCACGACGCCCGGCGCGCTCAAGAAGGCCGTGATGCTC 

CTCGGCGTGCTGGCGGTCCTGGTAGCCATGGTGGGGCTGGCCGCGCTGGACCGGCTCAGCAG 

GGGCCGCACCCTGCGCGACTGGCTGACCCGATATCGCCCGCGGGTGCGGGTCGGATTCGCCA 

GCCGGCTCGCTGACGCAGCGGTGATCGCGACCTTGTTGCTCTGGCATGTCATCGGCGCCACCT 

CGTCCGATGACGGCTACCTTCTGACCGTCGCCCGGGTCGCCCCGAAGGCCGGCTATGTAGCCA 

ACTACTACCGGTATTTCGGCACGACGGAGGCGCCGTTCGACTGGTATACATCGGTGCTTGCCCA 

GCTGGCGGCGGTGAGCACCGCCGGCGTCTGGATGCGCCTGCCCGCCACCCTGGCCGGAATCG 

CCTGCTGGCTGATCGTCAGCCGTTTCGTGCTGCGGCGGCTGGGACCGGGCCCGGGCGGGCTG 

GCGTCCAACCGGGTCGCTGTGTTCACCGCTGGTGCGGTGTTCCTGTCCGCCTGGCTGCCGTTC 

AACAACGGCCTGCGTCCCGAGCCGCTGATCGCGCTGGGTGTGCTGGTCACGTGGGTGTTGGTG 

GAACGGTCGATCGCGCTCGGACGGCTGGCCCCGGCCGCGGTAGCCATCATCGTGGCGACGCT 

TACCGCGACGCTGGCACCGCAGGGGTTGATCGCGCTGGCCCCGCTGCTGACTGGTGCGCGCG 

CCATCGCCCAGAGGATCCGGCGCCGCCGGGCGACCGATGGACTGCTGGCGCCGCTGGCGGT 

GCTGGCCGCGGCGTTGTCGCTGATCACCGTGGTGGTGTTTCGGGACCAGACGCTGGCCACGGT 

GGCCGAATCGGCACGCATCAAGTACAAGGTCGGCCCGACCATCGCCTGGTACCAGGACTTCCT 

GCGCTACTACTTCCTTACCGTGGAGAGCAACGTTGAGGGGTCGATGTCCCGCCGGTTCGCGGT 

GCTGGTGTTGCTGTTCTGCCTGTTCGGGGTGCTGTTCGTGCTGCTGCGGCGCGGCCGGGTGGC 
GGGGCTGGCCAGCGGCCCGGCCTGGCGACTGATCGGCACTACGGCGGTCGGCCTGCTGCTGC
TCACGTTCACGCCAACCAAGTGGGCCGTGCAGTTCGGCGCATTCGCCGGGCTGGCCGGGGTGT 
TGGGTGCGGTCACCGCGTTCACCTTTGCCCGCATCGGTCTACATAGTCGACGCAACCTCACGCT 
GTACGTGACCGCGTTGCTGTTCGTGCTGGCGTGGGCAACCTCGGGCATCAACGGGTGGTTCTA 
CGTCGGCAACTACGGGGTGCCGTGGTATGACATCCAGCCCGTCATCGCCAGCCACCCGGTGAC
GTCGATGTTTCTGACGCTGTCGATCCTCACCGGATTGCTGGCAGCCTGGTATCACTTCCGGATG 
GACTACGCCGGGCACACCGAAGTCAAAGACAACCGGCGCAACCGCATCTTGGCCTCTACGCCA 
CTGCTGGTGGTCGCGGTGATCATGGTCGCAGGCGAAGTCGGCTCGATGGCCAAGGCCGCGGT 
GTTCCGTTACCCGCTTTACACCACCGCCAAGGCCAACCTGACCGCGCTCAGCACCGGGCTGTC 
CAGCTGTGCGATGGCCGACGACGTGCTGGCCGAGCCCGACCCCAATGCCGGCATGCTGCAAC
CGGTTCCGGGCCAGGCGTTCGGACCGGACGGACCGCTGGGCGGTATCAGTCCCGTCGGCTTC 
AAACCCGAGGGCGTGGGCGAGGACCTCAAGTCCGACCCGGTGGTCTCCAAACCCGGGCTGGT 
CAACTCCGATGCGTCGCCCAACAAACCCAACGCCGCCATCACCGACTCCGCGGGCACCGCCGG 
AGGGAAGGGCCCGGTCGGGATCAACGGGTCGCACGCGGCGCTGCCGTTCGGATTGGACCCGG 
CACGTACCCCGGTGATGGGCAGCTACGGGGAGAACAACCTGGCCGCCACGGCCACCTCGGCC
TGGTACCAGTTACCGCCCCGCAGCCCGGACCGGCCGCTGGTGGTGGTTTCCGCGGCCGGCGC 
CATCTGGTCCTACAAGGAGGACGGCGATTTCATCTACGGCCAGTCCCTGAAACTGCAGTGGGG 
CGTCACCGGCCCGGACGGCCGCATCCAGCCACTGGGGCAGGTATTTCCGATCGACATCGGACC 
GCAACCCGCGTGGCGCAATCTGCGGTTTCCGCTGGCCTGGGCGCCGCCGGAGGCCGACGTGG 
CGCGCATTGTCGCCTATGACCCGAACCTGAGCCCTGAGCAATGGTTCGCCTTCACCCCGCCCC
GGGTTCCGGTGCTGGAATCTCTGCAGCGGTTGATCGGGTCAGCGACACCGGTGTTGATGGACA 
TCGCGACCGCAGCCAACTTCCCCTGCCAGCGACCGTTTTCCGAGCATCTCGGCATTGCCGAGC 
TTCCGCAGTACCGGATCCTGCCGGACCACAAGCAGACGGCGGCGTCGTCGAACCTATGGCAGT 
CCAGCTCGACCGGCGGTCCGTTCCTGTTCACCCAGGCGCTGCTGCGCACCTCGACGATCGCCA 
CGTACCTGCGTGGGGACTGGTATCGCGACTGGGGATCGGTGGAGCAGTACCACCGGCTGGTG
CCGGCCGATCAGGCTCCAGACGCCGTTGTCGAGGAGGGCGTGATCACTGTGCCCGGCTGGGG 

TCGGCCAGGACCGATCAGGGCGCTGCCATGA 

>Rv3795 embB TB.seq 4246511:4249804 MW:118023
>emb|AL123456|MTBH37RV:4246511-4249807. embB SEQ ID NO:144

ATGACACAGTGCGCGAGCAGACX3CAAAAGCACCCCAAATCGGGCGATTTTGGGGGCTTTTGCG 

TCTGCTCGCGGGACGCGCTGGGTGGCCACCATCGCCGGGCTGATTGGCTTTGTGTTGTCGGTG 
GCGACGCCGCTGCTGCCCGTCGTGCAGACCACCGCGATGCTCGACTGGCCACAGCGGGGGCA 
ACTGGGCAGCGTGACCGCCCCGCTGATCTCGCTGACGCCGGTCGACTTTACCGCCACCGTGCC 
GTGCGACGTGGTGCGCGCCATGCCACCCGCGGGCGGGGTGGTGCTGGGCACCGCACCCAAG
CAAGGCAAGGACGCCAATTTGCAGGCGTTGTTCGTCGTCGTCAGCGCCCAGCGCGTGGACGTC 
ACCGACCGCAACGTGGTGATCTTGTCCGTGCCGCGCGAGCAGGTGACGTCCCCGCAGTGTCAA 




CGCATCGAGGTCACCTCTACCCACGCCGGCACCTTCGCCAACTTCGTCGGGCTCAAGGACCCG 

TCGGGCGCGCCGCTGCGCAGCGGCTTCCCCGACCCCAACCTGCGCCCGCAGATTGTCGGGGT 

GTTCACCGACCTGACCGGGCCCGCGCCGCCCGGGCTGGCGGTCTCGGCGACCATCGACACCC 

GGTTCTCCACCCGGCCGACCACGCTGAAACTGCTGGCGATCATCGGGGCGATCGTGGCCACCG 

TCGTCGCACTGATCGCGTTGTGGCGCCTGGACCAGTTGGACGGGCGGGGCTCAATTGCCCAGC 

TCCTCCTCAGGCCGTTCCGGCCTGCATCGTCGCCGGGCGGCATGCGCCGGCTGATTCCGGCAA 

GCTGGCGCACCTTCACCCTGACCGACGCCGTGGTGATATTCGGCTTCCTGCTCTGGCATGTCAT 

CGGCGCGAATTCGTCGGACGACGGCTACATCCTGGGCATGGCCCGAGTCGCCGACCACGCCG 

GCTACATGTCCAACTATTTCCGCTGGTTCGGCAGCCCGGAGGATCCCTTCGGCTGGTATTACAA 

CCTGCTGGCGCTGATGACCCATGTCAGCGACGCCAGTCTGTGGATGCGCCTGCCAGACCTGGC 

CGCCGGGCTAGTGTGCTGGCTGCTGCTGTCGCGTGAGGTGCTGCCCCGCCTCGGGCCGGCGG 

TGGAGGCCAGCAAACCCGCCTACTGGGCGGCGGCCATGGTCTTGCTGACCGCGTGGATGCCG 

TTCAACAACGGCCTGCGGCCGGAGGGCATCATCGCGCTCGGCTCGCTGGTCACCTATGTGCTG 

ATCGAGCGGTCCATGCGGTACAGCCGGCTCACACCGGCGGCGCTGGCCGTCGTTACCGCCGC 

ATTCACACTGGGTGTGCAGCCCACCGGCCTGATCGCGGTGGCCGCGCTGGTGGCCGGCGGCC 

GCCCGATGCTGCGGATCTTGGTGCGCCGTCATCGCCTGGTCGGCACGTTGCCGTTGGTGTCGC 

CGATGCTGGCCGCCGGCACCGTCATCCTGACCGTGGTGTTCGCCGACCAGACCCTGTCAACGG 

TGTTGGAAGCCACCAGGGTTCGCGCCAAAATCGGGCCGAGCCAGGCGTGGTATACCGAGAACC

TGCGTTACTACTACCTCATCCTGCCCACCGTCGACGGTTCGCTGTCGCGGCGCTTCGGCTTTTT 

GATCACCGCGCTATGCCTGTTCACCGCGGTGTTCATCATGTTGCGGCGCAAGCGAATTCCCAGC 

GTGGCCCGCGGACCGGCGTGGCGGCTGATGGGCGTCATCTTCGGCACCATGTTCTTCCTGATG 

TTCACGCCCACCAAGTGGGTGCACCACTTCGGGCTGTTCGCCGCCGTAGGGGCGGCGATGGC 

CGCGCTGACGACGGTGTTGGTATCCCCATCGGTGCTGCGCTGGTCGCGCAACCGGATGGCGTT 

CCTGGCGGCGTTATTCTTCCTGCTGGCGTTGTGTTGGGCCACCACCAACGGCTGGTGGTATGTC 

TCCAGCTACGGTGTGCCGTTCAACAGCGCGATGCCGAAGATCGACGGGATCACAGTCAGCACA 

ATCTTTTTCGCCCTGTTTGCGATCGCCGCCGGCTATGCGGCCTGGCTGCACTTCGCGCCCCGC 

GGCGCCGGCGAAGGGCGGCTGATCCGCGCGCTGACGACAGCCCCGGTACCGATCGTGGCCG 

GTTTCATGGCGGCGGTGTTCGTCGCGTCCATGGTGGCCGGGATCGTGCGACAGTACCCGACCT 

ACTCCAACGGCTGGTCCAACGTGCGGGCGTTTGTCGGCGGCTGCGGACTGGCCGACGACGTA 

CTCGTCGAGCCTGATACCAATGCGGGTTTCATGAAGCCGCTGGACGGCGATTCGGGTTCTTGG 

GGCCCCTTGGGCCCGCTGGGTGGAGTCAACCCGGTCGGCTTCACGCCCAACGGCGTACCGGA 

ACACACGGTGGCCGAGGCGATCGTGATGAAACCCAACCAGCCCGGCACCGACTACGACTGGGA 

TGCGCCGACCAAGCTGACGAGTCCTGGCATCAATGGTTCTACGGTGCCGCTGCCCTATGGGCT 

CGATCCCGCCCGGGTACCGTTGGCAGGCACCTACACCACCGGCGCACAGCAACAGAGCACACT 

CGTCTCGGCGTGGTATCTCCTGCCTAAGCCGGACGACGGGCATCCGCTGGTCGTGGTGACCGC 

CGCGGGCAAGATCGCCGGCAACAGCGTGCTGCACGGGTACACCCCCGGGCAGACTGTGGTGC 

TCGAATACGCCATGCCGGGACCCGGAGCGCTGGTACCCGCCGGGCGGATGGTGCCCGACGAC 
CTATACGGAGAGCAGCCCAAGGCGTGGCGCAACCTGCGCTTCGCCCGAGCAAAGATGCCCGC

CGATGCCGTCGCGGTCCGGGTGGTGGCCGAGGATCTGTCGCTGACACCGGAGGACTGGATCG 

CGGTGACCCCGCCGCGGGTACCGGACCTGCGCTCACTGCAGGAATATGTGGGCTCGACGCAG 

CCGGTGCTGCTGGACTGGGCGGTCGGTTTGGCCTTCCCGTGCCAGCAGCCGATGCTGCACGC 

CAATGGCATCGCCGAAATCCCGAAGTTCCGCATCACACCGGACTACTCGGCTAAGAAGCTGGAC 

ACCGACACGTGGGAAGACGGCACTAACGGCGGCCTGCTCGGGATCACCGACCTGTTGCTGCG 

GGCCCACGTCATGGCCACCTACCTGTCCCGCGACTGGGCCCGCGATTGGGGTTCCCTGCGCAA 

GTTCGACACCCTGGTCGATGCCCCTCCCGCCCAGCTCGAGTTGGGCACCGCGACCCGCAGCG 

GCCTGTGGTCACCGGGCAAGATCCGAATTGGTCCATAG 

>Rv3834c serS seryl-tRNA synthase TB.seq 4307655:4308911 MW:45293
>emb|AL123456|MTBH37RV:c4308911-4307652, serS SEQ ID NO:145

GTGATCGACCTGAAGCTGCTTCGTGAAAACCCCGACGCGGTACGCCGCTCACAACTCAGCCGC 

GGCGAGGACCCGGCGCTGGTAGATGCCCTGCTGACGGCCGACGCCGCCCGCCGGGCCGTGA 

TCTCGACCGCCGATTCGTTACGGGCCGAGCAGAAAGCCGCCAGCAAAAGCGTGGGTGGCGCG 

TCTCCCGAAGAGCGCCCGCCGCTGCTGCGGCGCGCGAAGGAACTCGCCGAGCAGGTCAAAGC 

CGCTGAGGCCGACGAGGTCGAAGCGGAGGCGGCGTTCACCGCGGCGCACCTGGCGATCTCGA 

ATGTCATCGTGGACGGGGTACCCGCCGGCGGGGAGGACGACTACGCGGTGCTCGACGTCGTC 

GGCGAGCCCAGCTACCTCGAGAACCCCAAGGACCACCTGGAGCTCGGCGAGTCGCTGGGCCT 

GATCGACATGCAGCGCGGCGCCAAGGTGTCGGGTTCACGGTTCTACTTCCTGACCGGTCGGGG 

TGCCCTACTGCAGCTTGGATTGCTGCAGCTGGCGCTGAAGCTAGCCGTCGACAACGGCTTTGTC 

CCTACGATCCCGCCGGTGCTGGTGCGCCCGGAAGTGATGGTAGGCACGGGATTTCTAGGCGCC 

CACGCCGAGGAGGTGTACCGGGTAGAGGGCGACGGCCTCTACCTTGTGGGCACCTCCGAGGT 

ACCGCTGGCGGGGTATCACTCCGGCGAGATTCTGGACCTTTCCCGCGGGCCGCTGCGGTATGC 

GGGCTGGTCGTCGTGTTTCCGACGTGAGGCCGGCAGCCATGGCAAGGACACGCGCGGCATCA 

TCCGGGTGCACCAGTTCGACAAAGTCGAGGGCTTCGTCTACTGCACACCGGCCGACGCGGAGC 

ACGAACATGAGCGGCTGCTGGGCTGGCAGCGCCAGATGCTGGCACGCATCGAGGTGCCGTAT 

CGGGTCATCGACGTGGCCGCGGGTGATCTCGGCTCGTCGGCCGCCCGCAAGTTCGACTGCGA 

GGCGTGGATTCCGACGCAGGGGGCCTATCGCGAGCTGACGTCGACGTCGAACTGCACCACCTT 

TCAGGCGCGCCGGTTGGCGACCCGCTACCGGGATGCCAGCGGCAAGCCGCAGATCGCGGCCA 

CCCTCAACGGAACGCTGGCCACCACCCGGTGGCTGGTTGCGATCCTGGAGAACCACCAGCGG 

CCCGACGGCAGCGTTAGAGTCCCGGACGCACTGGTTCCGTTCGTGGGTGTCGAAGTGCTGGAG 

CCGGTCGCTTAG 

>Rv3907c pcnA polynucleotide polymerase TB.seq 4391631:4393070 MW:53057
>emb|AL123456|MTBH37RV:c4393070-4391628, pcnA SEQ ID NO:146 
GTGCCGGAAGCCGTCCAGGAAGCCGATCTGCTAACCGCCGCTGCGGTTGCCTTGAACAGGCAT

GCTGCCTTATTGCGGGAACTCGGGTCGGTGTTCGCCGCCGCGGGACACGAGTTGTATCTGGTC 

GGCGGTTCGGTGCGAGATGCACTGTTGGGCCGGTTGAGCCCCGACCTGGACTTCACCACCGAC 

GCCCGTCCCGAGCGGGTGCAGGAGATCGTGCGGCCGTGGGCCGATGCGGTGTGGGATACCG 

GAATCGAATTCGGCACCGTCGGCGTGGGTAAGAGCGACCACCGCATGGAGATCACCACATTCC 

GTGCCGACAGCTACGACCGGGTTTCGCGTCATCCAGAGGTACGTTTCGGCGATTGCCTCGAGG 

GCGATCTGGTCCGCCGCGACTTCACCACGAACGCAATGGCTGTGCGCGTCACCGCCACTGGGC 

CGGGCGAATTCCTGGATCCGCTTGGTGGCTTGGCGGCGCTGCGGGCCAAGGTGTTAGACACCC 

CGGCGGCGCCGTCGGGGTCCTTTGGCGACGATCCGTTGCGGATGCTGCGCGCCGCGCGGTTC 

GTCTCGCAACTTGGATTCGCGGTGGCGCCGCGGGTGCGCGCGGCGATCGAAGAGATGGCGCC 

GCAGTTGGCCCGAATCAGCGCCGAACGGGTGGCCGCCGAGCTGGACAAGCTGCTGGTCGGTG 

AGGATCCGGCCGCGGGTATCGACCTGATGGTGCAGAGCGGTATGGGTGCTGTGGTCTTGCCTG 

AAATCGGTGGGATGCGGATGGCGATCGACGAACATCACCAGCACAAGGACGTCTATCAGCATTC 

CTTGACCGTGCTGCGGCAGGCGATCGCGCTGGAGGACGACGGCCCGGATCTGGTGTTGCGCT 

GGGCGGCGCTGCTGCACGACATCGGCAAGCCCGCCACCCGCCGTCACGAACCCGACGGTGGG 

GTGAGCTTCCATCACCACGAAGTGGTCGGCGCCAAGATGGTGCGCAAGCGGATGCGGGCGCT 

GAAGTATTCCAAGCAGATGATCGACGACATCTCGCAGCTGGTCTACCTGCATCTGCGGTTTCAC 

GGCTACGGCGATGGGAAATGGACCGACTCTGCGGTGCGCCGCTATGTCACCGACGCCGGGGC 

CCTACTGCCACGGCTGCACAAGCTGGTGCGCGCCGACTGCACGACCCGCAACAAGCGCCGGG 

CCGCGCGGTTGCAGGCCAGTTACGACCGGCTGGAAGAGCGGATCGCGGAGCTGGCCGCCCAG 

GAGGATCTGGATCGGGTGCGCCCCGACCTGGACGGCAACCAGATCATGGCGGTGCTCGACATT 

CCGGCGGGCCCGCAAGTCGGCGAGGCGTGGCGCTACTTGAAGGAGCTGCGGCTAGAGCGCG 

GCCCGTTGTCCACCGAGGAGGCGACAACCGAGCTGCTGTCCTGGTGGAAATCACGGGGGAAC 

CGCTAG 

TABLE 4 

>Rv0002 dnaN DNA polymerase III, b-subunit TB.seq 2052:3257 MW:42114 SEQ ID NO:147 
MDAATTRVGLTDLTFRLLRESFADAVSWVAKNLPARPAVPVLSGVLLTGSDNGLTISGFDYEVSAEA 

QVGAEIVSPGSVLVSGRLLSDITRALPNKPVDVHVEGNRVALTCGNARFSLPTMPVEDYPTLPTLPEE 

TGLLPAELFAEAISQVAIAAGRDDTLPMLTGIRVEILGETWLAATDRFRLAVRELKWSASSPDIEAAVL 

VPAKTLAEAAKAGIGGSDVRLSLGTGPGVGKDGLLGISGNGKRSTTRLLDAEFPKFRQLLPTEHTAVA 

TMDVAELIEAIKLVALVADRGAQVRMEFADGSVRLSAGADDVGRAEEDLWDYAGEPLTIAFNPTYLT 

DGLSSLRSERVSFGFTTAGKPALLRPVSGDDRPVAGLNGNGPFPAVSTDYVYLLMPVRLPG 

>Rv0003 recF DNA replication and SOS induction TB.seq 3280:4434 MW:42181 SEQ ID NO:148 
VrrVRHLGLRDFRSWACVDLELHPGRTVFVGPNGYGKTNLIEALVVYSTTLGSHRVSADLPLIRVGTDR 

AVISTIWNDGRECAVDLEIATGRVNKARLNRSSVRSTRDWGVLRAVLFAPEDLGLVRGDPADRRR 
YLDDLA.VRRPAIAAVRAEYER^RQRTALLKSVPGARYRGDRGVFDTLEWTOSRLAEHGAELVAARl

DLVNQLAPEVKKAYQLLAPESRSASIGYRASMDVTGPSEQSDIDRQU-AARLLAALAARRDAELERG
VCLVGPHRDDULRLGDQPAKGFASHGEAWSLAVALRLAAYQLLRVDGGEPVLLLDDVFAELDVMRR

RALATAAESAEQVLVTAAVLED1PAGWDARRVHIDVRADDTGSMSWLP

>Rv0005 gyrB DNA gyrase subunit B TB.seq 5123:7264 MW:78441 SEQ ID NO:149
MGKNEARRSALAPDHGTWCDPLRRLNRMHATPEESIRIVAAQKKKAQDEYGAASITILEGLEAVRKR 

PGMYIGSTGERGLHHLIVVEVN^NAVDEAMAGYATWNWLLEDGGVEVADDGRGIPVATHASGIPTV 
DWI^QLHAGGKFDSDAYAISGGLHGVGVSVVNAJ-STRLEVEIKRDGYEWSQVYEKSEPLGLKQGA 
PTKKTGSWRFWADPAVFETTEYDFEWARRLQEMAFLNKGLTINLTDERVTQDEVVDEWSDVAEA 
PKSASERAAESTAPHKVKSRTFHYPGGLVDFVKH1NRTKNAIHSSIVDFSGKGTGHEVEIAMQWNAG 
YSESVHTFANTINTHEGGTHEEGFRSALTSWNKYAKDRKLLKDKDPNLTGDDIREGLAAVISVKVSE 
PQFEGQTKTKLGNTEVKSFVQKVCNEQLTHWFEANPTDAKVWNKAVSSAQARIAARKARELVRRK 
SATDIGGLPGKLADCRSTDPRKSELYWEGDSAGGSAKSGRDSMFQAILPLRGKIINVEKARIDRVI-K 
NTEVQAIITALGTGIHDEFDIGKLRYHKIVLMADADVDGQHISTLLLTLLFRFMRPLIENGHVFLAQPPLY 
KLKWQRSDPEFAYSDRERDGLLEAGLKAGKK.NKEDG1QRYKGLGEMDAKELWETTMDPSVRVLRQ 
VTLDDAAAADELFSILMGEDVDARRSFITRNAKDVRFLDV 

>Rv0006 gyrA DNA gyrase subunit A TB.seq 7302:9815 MW:92276 SEQ ID NO:150
MTDTTLPPDDSLDR»EP\miEQEMQRSYIDYAMSV.VGRALPEVRDGLKPVHRRVLYAMFDSGFRPD 

RSHAKSARSVAETMGNYHPHGDASIYDSLVRMAQPWSLRYPLVDGQGNFGSPGNDPPAAMRYTEA 

RLTPLAMEMLREIDEETVDFIPNYDGRVQEPTVLPSRFPNLLANGSGG.AVGMATN.PPHNLRELADA 

VFWALENHDADEEETLAAVMGRVKGPDFPTAGLIVGSQGTADAYKTGRGSIRMRGWEVEEDSRG 

RTSLVITELPYQVNHDNF.TSIAEQVRDGKLAG.SN.EDQSSDRVGLRIV.EIKRDAVAKVV.NNLYKHTQ 

LQTSFGANMLAIVDGVPRTLRLDQLIRYWDHQLDVIVRRTTYRLRKANERAHILRGLVKALDALDEVI

ALIRASETVDIARAGLIELLDIDEIQAQAILDMQLRRLAALERQRIIDDLAKIEAEIADLEDILAKPERQRGI
VRDELAEIVDRHGDDRRTRIIAADGDVSDEDLIAREDVWTITETGYAKRTKTDLYRSQKRGGKGVQG 

AGLKQDDIVAHFFVCSTHDLILFFTTQGRVYRAKAYDLPEASRTARGQHVANLl-AFQPEERIAQVIQIR 

GYTDAPYLN/LATRNGLVKKSKLTDFDSNRSGGIVAVNLRDNDELVGAVLCSAGDDLLLVSANGQSIR 

FSATDEALRPMGRATSGVQGMRFNIDDRLLSLNWREGTYLLVATSGGYAKRTAIEEYPVQGRGGK 

GVLWMYDRRRGRLVGALIVDDDSELYAWSGGGVIRTAARQVRKAGRQTKGVRLMNLGEGDTLLAI 

ARNAEESGDDNAVDANGADQTGN 

>Rv0014c pknB serine-threonine protein kinase TB.seq 15593:17470 MW:66511 SEQ ID NO:151 
t/rTTPSHLSDRYELGEILGFGGMSEVHLARDLRLHRDVAVKVLRADl^RDPSFYLRFRREAQNAAALN
HPAIVAWDTGEAETPAGPLPYIVMEYVDGVTLRDIVHTEGPMTPKRAIEVIADACQALNFSHQNGIIH 
RDVKPANIMISATNAVKVMDFGIARAIADSGNSVTQTAAVIGTAQYLSPEQARGDSVDARSDVYSLGC 
VLYEVLTGEPPFTGDSPVSVAYQHVREDPIPPSARHEGLSADLDAWLKALAKNPENRYQTAAEMRA
DLVRVHNGEPPEAPKVLTDAERTSLLSSAAGNLSGPRTDPLPRQDLDDTDRDRSIGSVGRWVAWA 
VLAVLTVWTIAINTFGGITRDVQVPDVRGQSSADAIATLQNRGFKIRTLQKPDSTIPPDHVIGTDPAAN 
TSVSAGDEITVNVSTGPEQREIPDVSTLTYAEAVKKLTAAGFGRFKQANSPSTPELVGKVIGTNPPAN 
QTSAITNWIIIVGSGPATKDIPDVAGQWDVAQKNLNVYGFTKFSQASVDSPRPAGEVTGTNPPAGT 
TVPVDSVIELQVSKGNQFVMPDLSGMFVVVDAEPRLRALGVVTGMLDKGADVDAGGSQHNRWYQN 

PPAGTGVNRDGIITLRFGQ 

>Rv0016c pbpA TB.seq 18762:20234 MW:51577 SEQ ID NO:152 
MNASLRRISVTVMALIVLLLLNATOTQVFTADGLRA^ 

DGRFRFLRVYPNPEVYAPVTGFYSLRYSSTALERAEDPILNGSDRRLFGRRLADFFTGRDPRGGNV 

DTTINPRIQQAGWDAMQQGCYGPCKGAWALEPSTGKILALVSSPSYDPNLLASHNPEVQAQAWQR

LGDNPASPLTNRAISETYPPGSTFKVITTAAALAAGATETEQLTAAPTIPLPGSTAQLENYGGAPCGDE 

PTVSLREAFVKSCNTAFVQLGIRTGADALRSMARAFGLDSPPRPTPLQVAESTVGPIPDSAALGMTSI 

GQKDVALTPLANAEIAATIANGGITMRPYLVGSLKGPDLANISTTVGYQQRRAVSPQVAAKLTELMVG 

AEKVAQQKGAIPGVQIASKTGTAEHGTDPRHTPPHAWYIAFAPAQAPKVAVANA-VENGADRLSATGG 

ALAAPIGRAVI EAALQGEP 

>Rv0017c rodA TB.seq 20234:21640 MW:50612 SEQ ID NO:153 

MTTRLQAPVAVTPPLPTRRNAELL1XCFAAVITFAALLWQANQDQGVPWDLTSYGLAFLTLFGSAHL 

AIRRFAPYTDPLLLPWALLNGLGLVMIHRLDLVDNEIGEHRHPSANQQMLVVnTLVGVAAFALVVTFLK 

DHRQLARYGYICGU^GLVFI^VPALi.PAALSEQNGAKIWIRLPGFSIQPAEFSKILLLIFFSAVLVAKRG 

LFTSAGKHLLGMTLPRPRDLAPLLAAVVVISVGVMVFEKDLGASLLLYTSFLWVYl^TQRFSVVW 

TLFAAGTLVAYFIFEHVRLRVQTWLDPFADPDGTGYQIVQSLFSFATGGIFGTGLGNGQPDTVPAAST 

DFIIAAFGEELGLVGLTAILMLYTIVIIRGLRTAIATRDSFGKLLAAGLSSTLAIQLFIWGGVTRLIPLTGLT

TPWMSYGGSSLLANYILLAILARISHGARRPLRTRPRNKSPITAAGTEVIERV 

>Rv0018c ppp TB.seq 21640:23181 MW:53781 SEQ ID NO:154 

VARVTLVLRYAARSDRGLVRANNEDSVYAGARLLALADGMGGHAAGEVASQLVIAALAHLDDDEPG 

GDLLAKLDAAVRAGNSAIAAQVEMEPDLEGMGTTLTAILFAGNRLGLVHIGDSRGYLLRDGELTQITK 

DDTFVQTLVDEGRITPEEAHSHPQRSLIMRALTGHEVEPTLTMREARAGDRYLLCSDGLSDPVSDETI 

LEALQIPEVAESAHRLIEUU-RGGGPDNVTVWADWDYDYGQTQPILAGAVSGDDDQLTLPNTAAG 

RASAISQRKEIVKRVPPQADTFSRPRWSGRRL^FWAL\n*VLMTAGLLIGRAIIRSNYYVADYAGSVSI 

MRG.QGSLLGMSLHQPYLMGCLSPRNELSQISYGQSGGPLDCHLMKLEDLRPPERAQVRAGLPAGT 

LDDAIGQLRELAANSLLPPCPAPRATSPPGRPAPPTTSETTEPNVTSSPASPSPTTSAPAPTGTTPAIP 

TSASPAAPASPPTPWPVTSSPTMAALPPPPPQPGIDCRAAA 
>Rv0019c - TB.seq 23273:23737 MW:17153 SEQ ID NO:155

MQGL\A-QLTI^GFLMLI.V\A/FIWSVLRILKTDIYAPTGAVMMRRGLALRGTLLGARQRRHAARYLWT 
EGALTGAR.TLSEQPVL.GRADDSTLNA.TDDYASTRHARLSMRGSEWYVEDLGSTNGTYLD^ 

AVRVPIGTPVRIGKTAIELRP 

>Rv0020c - TB.seq 23864:25444 MW:56881 SEQ ID NO:156

MGSQKRLVQRVERKLEQTVGDAFARIFGGSIVPQEVEALLRREAADGIQSLQGNRLLAPNEY1ITLGV 

HDFEKLGADPELKSTGFARDLADYIQEQGWQTYGDVWRFEQSSNLHTGQFRARG'TVNPDVETHP 

PV.DCARPQSNHAFGAEPGVAPMSDNSSYRGGQGQGRPDEYYDDRYARPQEDPRGGPDPQGGS 

DPRGGYPPETGGYPPQPGYPRPRHPDQGDYPEQIGYPDQGGYPEQRGYPEQRGYPDQRGYQDQ 

GRGYPDQGQGGYPPPYEQRPPVSPGPAAGYGAPGYDQGYRQSGGYGPSPGGGQPGYGGYGEY 

GRGPARHEEGSYVPSGPPGPPEQRPAYPDQGGYDQGYQQGATTYGRQDYGGGADYTRYTESPR 

VPGYAPQGGGYAEPAGRDYDYGQSGAPDYGQPAPGGYSGYGQGGYGSAGTSVTLQLDDGSGRT 

YQLREGSN.IGRGQDAQFRLPDTGVSRRHLE.RWDGQVALLADLNSTNGTTVNNAPVQEWQLADGD 

VIRLGHSEIIVRMH 

>Rv0032 bioF2 C-terminal similar to B. subtilis BioF TB.seq 34295:36607 MW:86245

MPTGLGYDFLRPVEDSGI NDLKHYYFMADLADGQPLGRANLYSVCFDLATTDRKLTPAWRTTl 
PGFMTFRFLECGLLTMVSNPLALRSDTDLERVLPVLAGQMDQLAHDDGSDFLMIRDVDPEHYQRYL 

d!lrplgfrpalgfsrvdttiswssveealgclshkrrlpij<tslefrerfgieveeldeyaeh 

RLWRNVKTEAKDYQREDU^PEFFAACSRHLHGRSRLW^ 

DRDFEHYRKANLYRAALMLSLKDAISRDKRRMEMG.TNYFTKLR.PGARV,PT.YFLRHSTDFA/H^ 

ARMMMHN.QRPTLPDDMSEEFCRWEER.RLDQDGLPEHD.FRK.DRQHKYTGLKLGGVYGFYPRFT 

GPQRSTVKAAELGEiVLLGTNSYLGLATHPEWEASAEATRRYGTGCSGSPLLNGTLDLHVSLEQEL 

ACFLGKPAAVLCSTGYQSNLAAiSALCESGDMIIQDALNHRSLFDAARLSGADFTLYRHNDMDHLARV 

LRRTEGRRRIIWDAVFSMEGWADLATIAEI^DRHGCRVYVDESHALGVLGPDGRGASAALGVLAR 

MDWMGTFSKSFASVGGFIAGDRPWDYIRHNGSGHVFSASLPPAAAAATHAALRVSRR^ 

VAJkAAEYMATGU^RQGYQAEYHGTAIVPVILGNPTVAHAGYLRLMRSGVWNPVAPPAW 

TSYLADHRQSDLDRALHVFAGLAEDLTPQGAAL 

>Rv0050 ponA1 TB.seq 53661:55694 MW:71119 SEQ ID NO:158
W.LLPMWFTMAYL.VDVPKPGD.RTNQVST.LASDGSE.AKIVPPEGNRVDVNLSQVPMHVRQAVIAA 

EDRNFYSNPGFSFTGFARAVKNNLFGGDLQGGST.TQQYVKNALVGSAQHGWSGLMRKAKELV.AT 

KMSGEWSKDDVLC^YLNIIYFGRGAYGISAASKAYFDKPVEQLTVAEGALLAALIRRPSTLDPAVDPE 

GAHARWNVVVLDGMVETKALSPNDRAAQVFPEWPPDI^RAENQTKGPNGLIERQNn'RELJ.ELFNI^ 

EQTLNTQGLNAmTIDPQAQRAAEKAVAKYLDGQDPDMRAAWSIDPHNGAVRAYYGGDNANGFDF 
AQAGLQTGSSFKVFALVAALEQGIGLGYQVDSSPLTVDGIKITNVEGEGCGTCNIAEALKMSLNTSYY

RLMLKLNGGPQAVADAAHQAGIASSFPGVAHTLSEDGKGGPPNNGIVLGQYQTRV1DMASAYATLAA 

SGIYHPPHFVQKWSANGQVLFDASTADNTGDQRIPKAVADNVTAAMEPIAGYSRGHNLAGGRDSA 

aktgttqfgdttankdawwgytpslstavvw^^ 
s lkgtsnetfpkptevggyagvpppppppevppset^qptwiapgitipigppttitlappppappaat 

PTPPP 

>Rv0051 - TB.seq 55694:57373 MW:61210 SEQ ID NO:159

\rrGALSQSSNISPLPLAADLRSADNRDCPSRTDVLGAALANWGGPVGRHALIGRTRLMTPLRVMFAl 
ALVFLALGWSTKAACLQSTGTGPGDQRVANWDNQRAYYQLCYSDTVPLYGAELLSQGKFPYKSSWI
ETDSNGTPQLRYDGQIAVRYMEYPWTGIYQYLSNMIAKTYTALSKVAPLPWAEWMFFNVAAFGLA 
IJVWLTTVWATSGLAGRRIWDAALVAASPLVIFQIFTNFDALA^ 

SAAKLYPU.FLYPLU_LGIRAGRLNAU^TMAAAAATWLLVNLPVMLLFPRGWSEFFRLNTRRGDDM 
DSLYNWKSFTGWRGFDPTLGnWEPPL\^NTVVTLJ.FVLCCAAIAYIALTAPHRPRVAQLTFLTVASFL 

LVNKVWSPQFSLV\fl-VPLAVl^PHRRILLAW^^

IAVMVLCGLWWQIYRPGRDLVRTGGPGALPACGGVDDPVGGVFANAADAPPGRLPSWLRPRLGD 

EHARERTPDAGRDRTFSGQHRA 
>Rv0106 - TB.seq 124372:125565 MW:43701 SEQ ID NO:160

MRTPVILVAGQDHTDEVTGALLRRTGTVWEHRFDGHWRRMTATLSRGELITTEDALEFAHGCVSC 

TIRDDU.VLLRRLHRRDNVGRIWHLAPWLEPQPICWA1DHVRVCVGHGYPDGPAALDVRVAAWTC 

VDCVRWLPQSLGEDELPDGRWAQNTTVGQAEFADLLVLTHPEPVAVAVLRRLAPRARITGGVDRVEL 

ALAHLDDNSRRGRTDTPHTPLLAGLPPLAADGEVAIVEFSARRPFHPQRLHAAVDLLLDGWRTRGR 

LWLANRPDQVMWLESAGGGLRVASAGKWLAAMAASEVAYVDLERRLFADLMVVVYPFGDRHTAMT 

VLVCGADPTDIVNALNAALLSDDEMASPQRWQSYVDPFGDWHDDPCHEMPDAAGEFSAHRNSGES 



>Rv0125 - TB.seq 151146:152210 MW:34927 SEQ ID NO:161

MSNSRRRSLRWSWLLSVLAAVGLGLATAPAQAAPPALSQDRFADFPALPLDPSAMVAQVGPQWNI 

NTKLGYNNAVGAGTGIVIDPNGWLTNNHVIAGATDINAFSVGSGQTYGVDWGYDRTQDVAVLQLR 

GAGGLPSAAIGGGVAVGEPWAMGNSGGQGGTPRAVPGRWALGQTVC3ASDSLTGAEETLNGLIQ 

FDAAIQPGDSGGPWNGLGQWGMNTAASDNFQLSQGGQGFAIPIGQAMAIAGQIRSGGGSPTVHI 

GPTAFLGLGWDNNGNGARVQRWGSAPAASLGISTGDVITAVDGAPINSATAMADALNGHHPGDVI 

SVTWQTKSGGTRTGNVTLAEGPPA 

>Rv0350 dnaK 70 kD heat shock protein, chromosome replication TB.seq 419833:421707 
MW:66832 SEQ ID NO:162
MARAVGIDLGTTNSWSVLEGGDPVWANSEGSRTTPSIVAFARNGEVLVGQPAKNQAVTNVDRTV

RSVKRHMGSDWSIEIDGKKYTAPEISARILMKLKRDAEAYLGEDITDAVITTPAYFNDAQRQATKDAG 

QIAGLNVLR1VNEPTAAALAYGLDKGEKEQRILVFDLGGGTFDVSLLEIGEGWEVRATSGDNHLGGD 

DWDQRWDWLVDKFKGTSGIDLTKDKMAMQRLREAAEKAK1ELSSSQSTSINLPYITVDADKNPLFLD 

EQLTRAEFQRITQDLLDRTRKPFQSVIADTGISVSEIDHWLVGGSTRMPAVTDLVKELTGGKEPNKG 

VNPDEWAVGAALQAGVLKGEWDVLLLDVTPLSLGIETKGGVMmLIERNnTIPTKRSETFTTADDN 

QPSVQIQWQGEREIAAHNKLLGSFELTGIPPAPRGIPQIEVTFDIDANGIVHVTAKDKGTGKENTIRIQ 

EGSGLSKEDIDRMIKDAEAHAEEDRKRREEADVRNQAETLVYQTEKFVKEQREAEGGSKVPEDTLN 

KVDAAVAEAKAALGGSDISAIKSAMEKLGQESQALGQAIYEAAQAASQATGAAHPGGEPGGAHPGS 

ADDWDAEWDDGREAK 

>Rv0351 grpE stimulates DnaK ATPase activity TB.seq 421707:422411 MW:24501
SEQ ID NO:163

VTDGNQKPDGNSGEQVTWDKRRIDPETGEVRHVPPGDMPGGTAAADAAHTEDKVAELTADLQRV 
QADFANYRKRALRDQQAAADRAKASWSQLLGVLDDLERARKHGDLESGPLKSVADKLDSALTGLG 
LVAFGAEGEDFDPVLHEAVQHEGDGGQGSKPVIGTv^RQGYQLGEQVLRHALVGVVDTVWDAAE 

LESVDDGTAVADTAENDQADQGNSADTSGEQAESEPSGS 

>Rv0352 dnaJ acts with GrpE to stimulate DnaK ATPase TB.seq 422450:423634 MW:41346
SEQ ID NO:164

MAQREWVEKDFYQELGVSSDASPEEIKRAYRKLARDLHPDANPGNPAAGERFKAVSEAHNVLSDPA 

KRKEYDETRRLFAGGGFGGRRFDSGFGGGFGGFGVGGDGAEFNLNDLFDAASRTGGTTIGDLFGG 

LFGRGGSARPSRPRRGNDLETETELDFVEAAKGVAMPLRLTSPAPCTNCHGSGARPGTSPKVCPTC 

NGSGVINRNQGAFGFSEPCTDCRGSGSUEHPCEECKGTGVTTRTRTINVRIPPGVEDGQRIRLAGQ 

GEAGLRGAPSGDLYNm/HVRPDKIFGRDGDDLTNAVPVSFTELALGSTLSVPTLDGTVGVRVPKGTA 

DGRILRVRGRGVPKRSGGSGDLLvTVKN/AVPPNLAGAAQEALEAYAAAERSSGFNPRAGWAGNR 

>Rv0363c fba fructose bisphosphate aldolase TB.seq 441266:442297 MW:36545 

SEQ ID NO: 165 

MPIATPEVYAEMLGQAKQNSYAFPAINCTSSETVNAAIKGFADAGSDGIIQFSTGGAEFGSGLGVKDM 

VTGAVAl^EFTHvlAAKYPVNVALHTDHCPKDKLDSWRPLLAISAQRVSKGGNPLFQSHMWDGSAV 

PIDENI^AQELLKAAAAAKIILEIEIGWGGEEDGVANEINEKLYTSPEDFEKTIEALGAGEHGKYLLAA 

TFGrWHGVYKPGNVKLRPDILAQGQQVAAAKLGLPADAKPFDFVFHGGSGSLXSEIEEALRYGVVKM 

NVDTDTQYAFTRPIAGHMFTNYDGVLKVDGEVGVKKvYDPRSYLKKAEASMSQRWQACNDLHCA 

GKSLTH 

>Rv0405 pks6 TB.seq 485729:489934 MW:147615 SEQ ID NO:166

MTDGSVTADKLQKWFREYLSTHIECHPNEVSLDVPIRDLGLKSIDVLAIPGDLGDRFGFCIPDLAVWD 
NPSANDLIDSLLNQRSADSLRESHGHADRNTQGRGSINEPVAVIGVGCRFPGDIDGPERLWDFLTEK 
KCAITAYPDRGFTNAGTFAESGGFLKDVAGFDNRFFDIPPDEALRMDPQQRLLLEVSWEALEHAGIIP

ESLRLSRTGVFVGVSSTDYVRLVSASAQQKSTIWDNTGGSSSIIANRISYFLDIQGPSIVIDTACSSSLV 

AVHLACRSLSTWDCDIALVGGTNVLISPEPWGGFREAGILSQTGCCHAFDKSADGMVRGEGCGVIVL 

QRLSDARLEGRRILAILTGSAVNQDGKSNGIMAPNPSAQIGVLENACKSARVDPLEIGYVEAHGTGTS

LGDRIEAHALGMVFGRKRPGSGPLMIGSIKPNIGHLEGAAGIAGLIKAVLMVERGSLLPSGGFTEPNP 

AlPFTELGLRWDELQEWPWAGRPRRAGVSSFGFGGTNAHVIVEEAGSVGADTVSGRADVGGSGG 

GVVA\WISGKTASAI^C^GRLGRWRARPALD\A^DVGYSLVSTRSVFDHRAVWGQTRDELLAGL 

AGWAGRPEAGWCGVGKPAGKTAFVFAGQGSQWLGMGSELYAAYPVFAEALDAWDELDRHLRY 

PLRDVIWGHDQDLLNTTEFAQPALFAVEVALYRU-MSWGVRPGLVLGHSVGELAAAHVAGALCLPD 

AAMLVAARGRLMQALPAGGAMFAVQAREDEVAPMLGHDVSIAAVNGPASWISGAHDAVSAIADRL 

RGQGRRVHRLAVSHAFHSALMEPMIAEFTAVAAELSVGLPTIPVISNVTGQLVADDFASADYWARHIR 

AWRFGDSVRSAHCAGASRFIEVGPGGGLTSLIEASLADAQIVSVPTLRKDRPEPVSVMTAAAQGFV 

SGMGLDWASVFSGYRPKRVELPTYAFQHQKFWLAPAPSVSDPTAAGQIGASDGGAELLASSGFAA 

RLAGRSADEQLAAAIEWCEHAAAVLGRDGAAGLDAGQAFADSGFNSLSAVELRNRLTAVTAVTLPA 

TAIFDHPTPTELAQYLITQIDGHGSSAAAAANPAERIDALTDLFLQACDAGRDADGWKMVALASNTRE 

RMSSP\mNNVSKrWALLADGISD\AA^CIPTLTSA.SDQREYRDIANAMTGRHSVYSLTLPGFDSSDAL 

PQNADMIVEWSNAIIDWGGSCRFVLSGYSSGGVLAYALCSHLSVKHQRNPLGVALIDTYLPSQIAN 

PSMNEGFSPNDTGKGLSREVIRVARMLNRLTATRLTAAATYAAIFQAWEPGRSMAPVLNIVAKDRIAT 

VENLREERINRWRTAAAEAAYSVAEVPGDHFGMMSTSSEAIATEIHDWISGLVRGPHR 

>Rv0435c - ATPase of AAA-family TB.seq 522348:524531 MW:75315 SEQ ID NO:167

VTHPDPARQLTLTARLNTSAVDSRRGWRLHPNAIAALGIREWDAVSLTGSRTTAAVAGLAAADTAV 

GTVLLDD\nT_SNAGLREGTEVIVSPVTVYGARS\n"LSGSTLATQSVPPVTLRQALLGKVMTVGDAVSL 

LPRDLGPGTSTSAASRALAAAVGIS\rtn"SELLTVTGVDPDGPVSVQPNSLVTWGAGVPAAMGTSTAG 

QVSISSPEIQIEELKGAQPQAAKLTEWLKLALDEPHLLQTLGAGTNLGVLVSGPAGVGKATLVRAVCD 

GRRLVTLDGPEIGALAAGDRVKAVASAVQAVRHEGGVLLITDADALLPAAAEPVASLILSELRTAVATA 

GWLIATSARPDQLDARLRSPELCDRELGLPLPDAATRKSLLEALLNPVPTGDLNLDEIASRTPGFWA 

DLAALVREAALRAASRASADGRPPMLHQDDLLGALTVIRPLSRSASDEVTVGDVTLDDVGDMAAAK 

QALTEAVLWPLQHPDTFARLGVEPPRGVLLYGPPGCGKTFVVRALASTGQLSVHAVKGSELMDKWV 

GSSEKAVRELFRRARDSAPSLVFLDELDALAPRRGQSFDSGVSDRWAALLTELDGIDPLRDWMLG 

ATNRPDLIDPALLRPGRLERLVFVEPPDAAARREILRTAGKSIPLSSDVDLDEVAAGLDGYSAADCVAL 

LREAALTAMRRSIDAANVTAADLATARETVRASLDPLQVASLRKFGTKGDLRS 

>Rv0436c pssA CDP-diacylglycerol-serine o-phosphatidyltransferase TB.seq 524531:525388
MW:31219 SEQ ID NO:168

MIGKPRGRRGVNLQILPSAMTVLSICAGLTAIKFALEHQPKAAMALIAAAAILDGLDGRVARILDAQSR 
MGAEIDSU^AVNFGVTPALVLWSMLSKWPVGWWVLLY 
FFVGMPAPAGAVSMIGLLALKMQFGEG\WVn"SGWFLSFWVTGTSILLVSGIPMKKMHAVSVPPNY^
ALLAVLAICAAAAVLAPYLLIVWIIIAYMCHIPFAVRSQRWl^QHPEVWDDKPKQR 

YRPSMARLGLRKPGRRL 

>Rv0440 groEL2 60 kD chaperonin 2 TB.seq 528606:530225 MW:56728 SEQ ID NO:169
MAKTIAYDEEARRGLERGLNALADAVKVTLGPKGRNWLEKKWGAPTITNDGVSIAKEIELEDPYEKI 

GAELVKEVAKKTDDVAGDGTTTATVLA(^VREGLRNVAAGANPLGLKRGIEKAVEKV^LKGAK 

EVETKEQ.AATAA.SAGDQSIGDLIAEAMDKVGNEGV.TVEESNTFGLQLELTEGMRFDKGYISGYFVT 

DPERQEAVLEDPYILLVSSKVSTVKDLLPLLEKV1GAGKPLLIIAEDVEGEALSTLWNK.RGTFKSVAVK 

APGFGDRRKAMLQDMAILTGGQVISEEVGLTLENADLSLLGKARKWVTKDETTIVEGAGDTDAIAGR 

VAQIRQEIENSDSDYDREKLQERl^KLAGGVAVIKAGAATEVELXERKHRIEDAVRNAKAAVEEGIVA 

GGGVTLLQAAPTLDELKLEGDEATGANIVKVALEAPLKQIAFNSGLEPGWAEKVRNLPAGHGLNAQT 

GWEDLU^AGVADPVKVTRSALQNAASIAGLFLTTEAWADKPEKEKASVPGGGDMGGMDF 

>Rv0482 murB TB.seq 570537:571643 MW:38522 SEQ ID NO:170

MKRSGVGSLFAGAHIAEAVPLAPLTTLRVGPIARRVITCTSAEQWAALRHLDSAAKTGADRPLVFAG 

GSNLVIAENLTDLTWRLANSGITIDGNLVRAEAGAVFDDVWRAIEQGLGGLECLSGIPGSAGATPVQ 

NVGAYGAEVSDTITRVRLLDRCTGEVRVWSARDLRFGYRTSVLKHADGLAVPTWLEVEFALDPSGR 

SAPLRYGELIAALNATSGERADPQAVREAVl-ALRARKGMVLDPTDHDTWSVGSFFTNPVVTQDW 

RLAGDAATRKDGPVPHYPAPDGVKLAAGWLVERAGFGKGYPDAGAAPCRLSTKHALALTNRGGAT 

AEDWTLARAVRDGVHDVFGITLKPEPVLIGCML 

>Rv0483 - TB.seq 571708:573060 MW:47859 SEQ ID NO:171 

WIRVLFRPVSLIPVNNSSTPQSQGPISRRLALTALGFGVU^PNVLVACAGKVTKLAEKRPPPAPRLTF 
RPADSAADWPIAPISVEVGDGVVFQRVALTNSAGKWAGAYSRDRT1YTITEPLGYDTTYTWSGSAV 
GHDGKAVPVAGKFTrVAP^NAGFQLADGQTVGIAAPVIIQFDSPISDKAAVERALTNnTDPPVEGG 
WAWLPDEAQGARVHWRPREYYPAGTTVDVDAKLYGLPFGDGAYGAQDMSLHFQ.GRRQWKAEV 

SSHRIQWTDAGVIMDFPCSYGEADLARNVTRNGIHWTEKYSDFYMSNPAAGYSHIHERWAVRISN
NGEF.HANPMSAGAQGNSNVTNGC.NLSTENAEQYYRSAVYGDPVEVTGSS.QLSYADGDIWDWAV 

DWDTVWSMSALPPPAAKPAATQIPVTAPVTPSDAPTPSGTPTTTNGPGG 

>Rv0489 gpm phosphoglycerate mutase I TB.seq 578424:579170 MW:27217 SEQ ID NO:172 
MAhTTGSLVLLRHGESDWNALNLFTGVVVDVGLTDKGQAEAVRSGELIAEHDLLPDVLYTSLLRRAITT 

AHLALDSADRLWIPVRRSWRLNERHYGALQGLDKAETKARYGEEQFMAWRRSYDTPPPRERGSQ 

FSQDADPRYAD.GGGPLTECLADWARFLPYFTDVIVGDLRVGKTVL.VAHGNSLRALVKHLDQMSDD 

EIVGLNIPTGIPLRYDLDSAMRPLVRGGTYLDPEAAAAGAAAVAGQGRG 
>Rv0490 senX3 sensor histidine kinase TB.seq 579347:580576 MW:44794 SEQ ID NO:173

\AVFSALLI^GVLSALALAVGGAVGMRLTSRWEQRQRVATEWSGIWSQMLQCI\m.MPLGAAWD 

THRDWYLNERAKELGLVRDRQLDDQAWRAARQALGGEDVEFDLSPRKRSATGRSGLSVHGHARL 

LSEEDRRFAWFVHDQSDYARMEAARRDFVANVSHELKTPVGAMALLAEALLASADDSETVRRFAE 

KVLIEANRLGDMVAELIELSRLCK3AERLPNMTDVDVDTIVSEAISRHKVAADNADIEVRTDAPSNLRVL 

GDQTLLVTALANLVSNAIAYSPRGSLVSISRRRRGANIEIAVTDRGIGIAPEDQERVFERFFRGDKARS 

RATGGSGLGUMVKHVAANHDGTIRVWSKPGTGSTFTLALPALIEAYHDDERPEQAREPELRSNRSQ 

REEELSR 

>Rv0500 proC pyrroline-5-carboxylate reductase TB.seq 590081:590965 MW:30172
SEQ ID NO:174 

MLFGMARIAIIGGGSIGEALLSGLLRAGRQVKDLWAERMPDRANYLAQTYSVLVTSAADAVENATFV 
WAWPADVEPVIADLANATAAAENDSAEQVFVTWAGITIAYFESKLPAGTPWRAMPNAAALVGAG 
\n-ALAKGRFVTPQQLEEVSALFDAVGGVLTVPESQLDAVTAVSGSGPAYFFLLVEALVDAGVGVGLS 
RQVATDIJ^QTMAGSAAMLLERMEQDQGGANGELMGLRVDLTASRLRAAVTSPGGTTAAALRELE 

RGGFRMAVDAAVQAAKSRSEQLRITPE 

>Rv0528 - TB.seq 618303:619889 MW:57132 SEQ ID NO:175

MWRSLTSMGTALVLLFLLALAAIPGAU-PQRGLNAAKVDDYLAAHPLIGPWLDELQAFDVFSSFWFTA 

IYVLLFVSLVGCUKPRTIEHARSLRATPVAAPRNLARLPKHAHARLAGEPAALAATITGRLRGWRSITR 

QQGDSVEVSAEKGYLREFGNLVFHFALLGLLVAVAVGKLFGYEGNVMADGGPGFCSASPAAFDSF 

RAGNTVDGTSLHPICVRVNNFQAHYLPSGQATSFAADIDYQADPATADLIANSWRPYRLQVNHPLRV 

GGDRVYLQGHGYAPTFTVTFPDGQTRTSTVQWRPDNPQTLLSAGWRIDPPAGSYPNPDERRKHQI 

AIQGLLAPTEQLDGTLLSSRFPALNAPAVAIDIYRGDTGLDSGRPQSLFTLDHRLIEQGRLVKEKRVNL 

RAGQQVRIDQGPAAGTVVRFDGAVPFVNLQVSHDPGQSVVVLVFAITMMAGLLVSLLVRRRRVWARI 

TPTTAGTVNVELGGLTRTDNSGWGAEFERLTGRLLAGFEARSPDMAEAAAGTGRDVD 

>Rv0667 rpoB [beta] subunit of RNA polymerase TB.seq 759805:763320 MW:129220 
SEQ ID NO:176 

LADSRQSKTAASPSPSRPQSSSNNSVPGAPNRVSFAKLREPLEVPGLLDVQTDSFEWLIGSPRWRE 
SAAERGDVNPVGGLEE\A.YELSPIEDFSGSMSLSFSDPRFDDVKAPVDECKDKDMTYAAPLFVTAEF 
INNNTGEIKSQTWMGDFPMf^EKGTFIINGTERVWSQLVRSPGWFDETIDKSTDKTLHSVKVIPSR 
GAWLEFDVDKRDWGVRIDRKRRQPVTVU.KALG\^SEQIVERFGFSEIMRSTLEKDNTVGTDEALL 
DIYRKLRPGEPPTKESAQTLLENLFFKEKRYDLARVGRYKVNKKLGLHVGEPITSSTLTEEDWATIEY 
LVRLHEGQTTMWPGGVEVPVETDDIDHFGNRRLRTVGELIQNQIRVGMSRMERWRERMTTQDVE 
AITPQTLINIRPWAAIKEFFGTSQLSQFMDQNNPLSGLTHKRRLSALGPGGLSRERAGLEVRDVHPS 
HYGRMCPIETPEGPNIGLIGSLSVYARVNPFGFIETPYRKWDGWSDEIVYLTADEEDRHWAQANS 
PIDADGRFVEPR^RKAGEVEWPSSEVDYMDVSPRQMVSVATAMIPFLEHDDANF^MGANMQ
RQAVPLVRSEAPLVGTGMELRAAIDAGDWVAEESGVIEEVSADY1TVMHDNGTRRTYRMRKFARSN 
HGTCANQCPIVDAGDRVEAGQVIADGPCTDDGEMALGKNLLVAIMPWEGHNYEDAIILSNRLVEEDV 
LTSIHIEEHEIDARDTKLGAEEITRDIPNISDEVLADLDERGIVRIGAEVRDGDILVGKVTPKGETELTPE 
ERU-RA1FGEKAREVRDTSLKVPHGESGKVIGIRVFSREDEDELPAGVNELVRVYVAQKRKISDGDKL
AGRHGNKGVIGKILPVEDMPFLADGTPVDIILNTHGVPRRMNIGQ1LETHLGWCAHSGWKVDAAKGV 
PDWAARLPDELLEAQPNAIVSTPVFDGAQEAELQGLLSCTLPNRDGDVLVDADGKAMLFDGRSGEP 
FPYPVWGYMYIMKLHHLVDDKIHARSTGPYSMITQQPLGGKAQFGGQRFGEMECWAMQAYGAAY 
JLQELLJJKSDDWGRVKVYEAIVKGENIPEPGIPESFKVLLXELQSLCLNVEVLSSDGAAIELREGEDE 

DLERAAANLGINLSRNESASVEDLA

>Rv0668 rpoC [beta]' subunit of RNA polymerase TB.seq 763368:767315 MW:146740 
SEQ ID NO:177 

VLDVNFFDELRIGI^TAEDIRQWSYGEVKKPETINYRTLXPEKDGLFCEKIFGPTRDWECYCGKYKRV 
RFKGIICERCGVE\nT^AKVRRERMGHIEl^APVTHIWYFKGVPSRLGYLLDLAPKDLEKIIYFAAYVITS
VDEEMRHNELSTLEAEMAVERKAVEDQRDGELEARAQKLEADLAELEAEGAKADARRKVRDGGER 
EMRQIRDRAQRELDRLEDIWSTFTKLAPKQLIVDENLYRELVDRYGEYFTGAMGAESIQKLIENFDIDA 
EAESLRDVIRNGKGQKKLRALKRLKWAAFQQSGNSPMGMVLDAVPVIPPELRPMVQLDGGRFATS 
DLNDLYRRVINRNNRLKRLIDLGAPEIIVNNEKRMLQESVDALFDNGRRGRPVTGPGNRPLKSLSDLL 

KGKQGRFRQNLLGKRVDYSGRSVIWGPQLKLHQCGLPKLMALELFKPFVMKRLVDLNHAQNIKSAK
RMVERQRPQVWDVLEEVIAEHPVLLNRAPTLHRLGIQAFEPMLVEGKAIQLHPLVCEAFNADFDGDQ 
MAVHLPLSAEAQAEARILMLSSNNILSPASGRPLAMPRLDMVTGLYYLTTEVPGDTGEYQPASGDHP 
ETGWSSPAEAIMAADRGVLSVRAKIKVRLTQLRPPVEIEAELFGHSGWQPGDAWMAETTLGRVMF 
NELLPLGYPFVNKQMHKKVC^IINDLAERYPMIWAQTVDKLKDAGFYWATRSGVWSMADVLVPP 

RKKEILDHYEERADKVEKQFQRGALNHDERNEALVEIWKEATDEVGQALREHYPDDNPIITIVDSGAT
GNFTQTRTLAGMKGLVTNPKGEFIPRPVKSSFREGLTVLEYFINTHGARKGLADTALRTADSGYLTRR 
LVDVSQDVIVREHDCQTERGIWELAERAPDGTLIRDPYIETSAYARTLGTDAVDEAGNVIVERGQDL 
GDPEIDALLAAGITQVKVRSVLTCATSTGVCATCYGRSMATGKLVDIGEAVGIVAAQSIGEPGTQLTM 
RTFHQGGVGEDITGGLPRVQELFEARVPRGKAPIADVTGRVRLEDGERFYKITIVPDDGGEEWYDKI 

SKRQRLRVFKHEDGSERVLSDGDHVEVGQQLMEGSADPHEVLRVQGPREVQIHLVREVQEVYRAQ
GVSIHDKHIEVIVRQMLRRVTIIDSGSTEFLPGSLIDRAEFEAENRRWAEGGEPAAGRPVLMGITKAS 
U\TDSV\n_SAASFQETTRVLTDAAINCRSDKLNGLKENVIIGKLIPAGTGINRYRNIAVQPTEEARAAAYT 

IPSYEDQYYSPDFGAATGAAVPLDDYGYSDYR

>Rv0711 atsA TB.seq 806333:808693 MW:86216 SEQ ID NO:178

MAPEATEAFNGTIELDIRDSEPDWGPYAAPVAPEHSPNILYLVWDDVGIATWDCFGGLVEMPAMTRV 

AERGVRLSQFHTTALCSPTRASLLTGRNAT7VGMATIEEFTDGFPNCNGRIPADTALLPEVLAEHGYN 
TYCVGKWHLTPLEESNMASTKRHWPTSRGFERFYGFLGGETDQWYPDLVYDNHPVSPPGTPEGG

YHLSKDIADKTIEFIRDAKVIAPDKPWFSWCPGAGHAPHHVFKEWADRYAGRFDMGYERYREIVLE

RQKALG.VPPDTELSP.NPYLDVPGPNGETWPLQDWRPWDSLSDEEKKLFCRMAEVFAGFLSYTDA 
Q.GR.LDYLEESGQLDNTIIW.SDNGASGEGGPNGSVNEGKFFNGY1DTVAESMKLFDHLGGPQTYN 
HYPIGWAMAFNTPYKLFKRYASHEGGIADPAIISWPNGIAAHGEIRDNYVNVSDITPTVfDLLGM^
GTVKGIPQKPMDGVSF.AALADPAADTGKTTQFYTMLGTRGIWHEGWFANTIHAATPAGWSNFNAD
RWELFHIAADRSQCHDU^AEHPDKLEELXALWFSEAAKYNGLPLADLNLLETMTRSRPYLVSERASY 
N^PDCADVGIGAA^IRGRSFA^DVTIDTTGAEGVLFKHGGAHGGHVLFVRDGRLHYVYNFLGE 
RQQLVSSSGPVPSGRHLLGVRYLRTGTVPNSHTPVGDLELFFDENLVGALTNVLTHPGTFGLAGAAJ 
SVGRNGGSAVSSHYEAPFAFTGGTITQVTVDVSGRPFEDVESDLALAFSRD

>Rv0764c - lanosterol 14-demethylase cytochrome P450 TB.seq 856683:858035 MW:50879
MSAVA^ 

GDDDLDQAKAYPFMTPIFGEGWFDASPERRKEMLHNAALRGEQMKGHAATIEDQVRRMIADWGE 
AGE.DLLDFFAELTIYTSSACL.GKKFRDQLDGRFAKLYHELERGTDPLAYVDPYLPIESFRRRDEARN 
GLVALVADIMNGRIANPPTDKSDRDMLDVLIAVKAETGTPRFSADEITGMFISMMFAGHHTSSGTASW 
TLIELMRHRDAYAAV.DELDELYGDGRSVSFHALRQIPQLENVLKETLRLHPPLIILMRVAKGEFEVQG 
HRIHEGDLVAASPAISNR.PEDFPDPHDFVPARYEQPRQEDLLNRWTWIPFGAGRHRCVGAAFA1MQ. 

KAIFSVLLREYEFEMAQPPESYRNDHSKMWQLAQPACVRYRRRTGV 




>Rv0861c - DNA helicase TB.seq 958524:960149 MW:59773 SEQ ID NO:180 
VQSDKTN^LEVDHEl^GAARAAIAPFAELERAPEHVHTYRITPl^LWNARAAGHDAEQWDALVSYS 

RYAVPQPLLVDIVDTMARYGRLQLVKNPAHGLTLVSLDRAVLEEVLRNKKIAPMLGARIDDDTVWHP 

SERGRVKQLLLKIGWPAEDLAGY^GEAHPISLHQEGWQLRDYQRLAADSFWAGGSGVWLPCGA

GKTLVGAAAMAKAGATTLILVTNIVAARQWKRELVARTSLTENEIGEFSGERKEIRPVTISTYQMITRR

TKGEYRHLELFDSRDWGLIIYDEVHLLPAPVFRMTADLQSKRRLGLTATLIREDGREGDVFSLIGPKR 

YDAPWKDIEAQGWIAPAECVEVRVTMTDSERMMYATAEPEERYRICSTVHTKIAWKSILAKHPDEQ 

TLVIGAYLDQLDELGAELGAPVIQGSTRTSEREALFDAFRRGEVATLWSKVANFSIDLPEAAVAVQVS 

GTFGSRQEEAQRLGRILRPKADGGGAIFYSWARDSLDAEYAAHRQRFLAEQGYGYIIRDADDLLGP

Al 

>Rv0904c accD3 TB.seq 1006694:1008178 MW:51741 SEQ ID NO:181 

VSRITTDQLRHAVLDRGSFVSWDSEPLAVPVADSYARELAAARAATGADESVQTGEGRVFGRRVAV 
VACEFDFLGGSIGVAAAERITAAVERATAERLPLLASPSSGGTRMQEGTVAFLQMVKIAAAIQLHNQA
RLPYLVYLRHPTTGGVFASWGSLGHLTVAEPGALIGFLGPRVYELLYGDPFPSGVQTAENLRRHGIID

GWALDRLRPMLDRALTVL.DAPEPLPAPQTPAPVPDVPTWDSWASRRPDRPGVRQLLRHGATDR 
VLLSGTDQGEAATT11ALARFGGQPTV\A.GQQRAVGGGGSWGPAALREARRGMAL^AELCLPLVL

VIDAAGPALSAAAEQGGLAGQIAHCLAELVTLDTPWSILLGQGSGGPALAMLPADRV^^ 

LPPEGASAIVFRDTAHAAELAAAQGIRSADLLKSGIVDTIVPEYPDAADEPIEFALRLSNA1AAEVHALR 

Kl PAPERLATRLQRYRRIGLPRD 

>Rv0983 - TB.seq 1099064:1100455 MW:46454 SEQ ID NO:182 

MAKLARWGLVQEEQPSDMTNHPRYSPPPQQPGTPGYAQGQQQTYSQQFDWRYPPSPPPQPTQY 

RQPYEALGGTRPGLIPGVIPTMTPPPGMVRQRPRAGMLAIGAVTIAWSAGIGGAAASLVGFNRAPA 

GPSGGPVAASAAPSIPAANMPPGSVEQVAAKWPSWMLETDLGRQSEEGSGIILSAEGLILTNNHVI 

AAAAKPPLGSPPPKTTV/TFSDGRTAPFTWGADPTSDIAWRVQGVSGLTPISLGSSSDLRVGQPVLA 

IGSPLGLEGTVTTGIVSALNRPVSTTGEAGNQNTVLDAIQTDAAINPGNSGGALVNMNAQLVGVNSAI 

ATLGADSADAQSGS1GLGFAIPVDQAKRIADELISTGKASHASLGVQVTNDKDTLGAKIVEWAGGAA 

ANAGVPKGNA/VTKVDDRPINSADALVAAVRSKAPGATVALTFQDPSGGSRTVQVTLGKAEQ 

>Rv1008 - Similar to E.coli protein YcfH TB.seq 1127087:1127878 MW:29066 SEQ ID NO:183
LVDAHTHLDACGARDADWRSLVERAAAAGVTAWWADDLESARWVTTRAAEWDRRWAAVALHPT 

RADALTDAARAELERLVAHPRWAVGETGIDMYVVPGRLDGCAEPHVQREAFAWHIDLAKRTGKPLM 

IHNRQADRDVLDVLRAEGAPDWILHCFSSDAAMARTCVDAGWLLSLSG7VSFRTARELREAVPLMP 

VEQLLVETDAPYLTPHPHRGLANEPYCLPYTVRAI^LVNRRPEEVALITTSNARRAYGLGVVMRQ 



>Rv1009 - lipoprotein, similar to various other MTB proteins TB.seq 1128089:1129174 MW:38079
SEQ ID NO:184

MLRLWGALLLVU^FAGGYAVAACKTVTLTVDGTAMRVTTMKSRVIDIVEENGFSVDDRDDLYPAA^ 

VQVHDADTIVLRRSRPLQISLDGHDAKQVVvTrASWDEALAQLAMTDTAPAAASRASRVPLSGMALP 

WSAKWQLNDGGLVRTVHLPAPNVAGLLSAAGVPLLQSDHWPAATAPIVEGMQIQVTRNRIKKVTE 

RLPLPPNARRVEDPEMNMSREWEDPGVPGTQDVTFAVAEVNGVETGRLPVANNAyvTPAHEAWR 

VGTKPGTEVPPVIDGSIWDAIAGCEAGGNWAINTGNGYYGGVQFDQGTWEANGGLRYAPRADLAT 

REEQIAVAEVTRLRQGWGAWPVCAARAGAR 

>Rv1010 ksgA 16S rRNA dimethyltransferase TB.seq 1129150:1130100 MW:34647 
SEQ ID NO:185 

MCCTSGCALTIRLLGRTEIRRLAKELDFRPRKSLGQNFVHDANTVRRWAASGVSRSDLVLEVGPGL 
GSLTIAU_DRGATVTAVEIDPLLASRLQQWAEHSHSEV^ 

PYNVAVPALLHLLVEFPSIR\AnrVMVQAEVAERU^AEPGSKEYGVPS\A<LRFFGRVRRCGMVSPTVF 

WPIPRVYSGLVRIDRYETSPWPTDDAFRRRVFELVDIAFAQRRKTSRNAFVQWAGSGSESANRLLAA 

SIDPARRGETLSIDDFVRLLRRSGGSDEATSTGRDARAPDISGHASAS 
>Rv1011 - Similar to E.coli protein YcbH TB.seq 1130189:1131106 MW:31350
SEQ ID NO:186

VPTGSVWRVPGKVNLYLAVGDRREDGYHELTWFHAVSLVDEVTVRNADVLSLELVGEGADQLPTD 
ERNLAWO^ELMAEHVGRAPDVSIMIDKSIPVAGGMAGGSADAAAVLVAMNSLVVELNVPRRDLRML 
AARLGSDVPFALHGGTALGTGRGEELATVLSRNTFHWVLAFADSGLLTSAVYNELDRLREVGDPPRL 
GEPGPVLAALAAGDPDQLAPLLGNEMQAAAVSLDPALARALRAGVEAGALAGIVSGSGPTCAFLCTS 

ASSAIDVGAQLSGAGVCRTVRVATGPVPGARWSAPTEV 

>Rv1106c - cholesterol dehydrogenase TB.seq 1232845:1233954 MW:40743 SEQ ID NO:187
MLRRMGDASLTTELGRVLVTGGAGFVGANLVTTLLDRGHWVRSFDRAPSU-PAHPQLEVLQGDITD 

ADVCAAAVDGIDTIFHTAAIIELMGGASVTDEYRQRSFAVNVGGTENLLHAGQRAGVQRFVYTSSNS 

WMGGQNIAGGDETl-PYTDRFNDLYTETKWAERFVl-AQNGVDGMLTCAIRPSGIWGNGDQTMFRK 

LFESVLKGHVKVLVGRKSARLDNSYVHNLIHGFILAAAHLVPDGTAPGQAYFINDAEPINMFEFARPVL 

EACGQRWPKMRISGPAVRWVMTGWQRLHFRFGFPAPLLEPLAVERLYLDNYFSIAKARRDLGYEPL 

FTTQQALTECLPYYVSLFEQMKN EARAEKTAATVKP 



>Rv1110 lytB2 TB.seq 1236183:1237187 MW:36298 SEQ ID NO:188

MVPWDMGIPGASVSSRSVADRPNRKRVLLAEPRGYCAGVDRAVETVERALQKHGPPVYVRHEIVH 

NRHWDTLAKAGAVFVEETEQVPEGAIWFSAHGVAPTVHVSASERNLQVIDATCPLVTKVHNEARR 

FARDDYDILLIGHEGHEEWGTAGEAPDHVQLVDGVDAVDQVWRDEDKWWLSQTTLSVDETMEIV 

GRLRRRFPKLQDPPSDDICYATQNRQVAVKAMAPECELVIWGSRNSSNSVRLVEVALGAGARAAH 

LVDWADDIDSAWLDG\mVG\n-SGASVPE^VRGVl.ERLAECGYDIVQPVTTANETLVFALPRELRS 

PR 

>Rv1216c - TB.seq 1359473:1360144 MW:24863 SEQ ID NO:189 

MHIGLKIFIWGVLGLWFGALLFGPAGTFDYWQANWFl-AAFVSTTIGPTIYLARNDPAALQRRMRSGP 
LAEGRTIQKFIVIGAFLGFFAMM\A.SACDHRYGWSSVPAAVCVIGD\^VMTGLGIAMLWIQNRYAAS 
WRVEAGQILASDGLYKIVRHPMYAGNWMMTGIPLALGSYWAMFILVPGTLVLVFRILDEEKLLTQEL 

SGYREYRQLVRYRLVPYVW 

>Rv1223 htrA TB.seq 1365810:1367456 MW:56547 SEQ ID NO:190 

VSHLSQRMAGLLRVHGEWSRSVDTRVDTDNAMPARFSAQIQNEDEVTSDQGNNGGPNGGGRLAP 

RPVFRPPVDPASRQAFGRPSGVQGSFVAERVRPQKYQDQSDFTPNDQLADPVLQEAFGRPFAGAE 

SLQRHPIDAGALAAEKDGAGPDEPDDPWRDPAAAAALGTPALAAPAPHGALAGSGKLGVRDVLFGG 

KVSYLALGILVAIALVIGGIGGVIGRKTAEWDAFTTSKVTLSTTGNAQEPAGRFTKVAAAVADSVVTIE 

SVSDQEGMQGSGVIVDGRGYIVTNNHVISEAANNPSQFKTTWFNDGKEVPANLVGRDPKTDLAVLK 

VDNVDNLTVARLGDSSKVRVGDEVLAVGAPLGLRSTVTQGIVSALHRPVPLSGEGSDTDTVIDAIQTD
AS1NHGNSGGPLIDMDAQV1GINTAGKSLSDSASGLGFAIPVNEMKLVANSLIKDGK.VHPTLG1STRSV
SNAIASGAQVANVKAGSPAQKGGILENDVIVKVGNRAVADSDEFWAVRQLAIGQDAPIEWREGRH 

VTLTVKPDPDST 

>Rv1224 - TB.seq 1367461:1367853 MW:14083 SEQ ID NO:191

VFANIGWWEMLVLVNWGLVVLGPERLPGAIRWAASALRC3ARDYLSGVTSQLREDIGPEFDDLRGHL 

GELQKLRGMTPRAALTKHLLDGDDSLFTGDFDRPTPKKPDAAGSAGPDATEQ1GAGPIPFDSDAT 

>Rv1229c mrp similar to MRP/NBP35 ATP-binding proteins TB.seq 1371778:1372947 MW:41064 
SEQ ID NO:192

MPSRLHSAVMSGTRDGDLNAAIRTALGKVIDPELRRPITELGMVKSIDTGPDGSVHVEIYLTIAGCPKK 
SEITERVTRAVADVPGTSAVRVSLDVMSDEQRTELRKQLRGDTREPVIPFAQPDSLTRVYAVASGKG 
GVGKSTVIVNLAAAMAN/KGLSIGVIDADIHGHSIPRMMGTTDRPTQVES 
GNTPWWRGPMLHRALQQFLADWWGDLDV^^ 

VAERAGSIALQTRQRIVGWENMSGLTLPDGTTMQVFGEGGGRLVAERLSRAVGADVPLLGQIPLDP 
ALVAAGDSGVPLVLSSPDSAIGKELHSIADGLSTRRRGLAGMSLGLDPTRR 
>Rv1239c corA magnesium and cobalt transport protein TB.seq 1381943:1383040 MW:41470
SEQ ID NO:193

VFPGFDALPEVLRPVARPQPPNAHPVAQPPAQALVDCGVYVCGQRLPGKYTYAAALREVREIELTG 
QEAFVWIGLHEPDENQMQDVADVFGLHPLAVEDAVHAHQRPKLERYDETLFLVLKTVNYVPHESW 
LAREIVKTGEIMIFVGKDFX/VTVRHGEHGGLSEVRKRMDADPEHLRLGPYAVMHAIADYWDHYLEVT 
NLMETDIDSIEEVAFAPGRKLDIEPIYLLKREWELRRCVNPLSTAFQRMQTESKDLISKEVRRYLRDV 
ADHQTEAADQIASYDDMLNSLVCW^LARVGMQQNMDMRKISAWAGIIAVPTMIAGIYGMNFHFMPEL 

DSRWGYPTVIGGMVLICLFLYHVFRNRNWL 

>Rv1279 - TB.seq 1430060:1431643 MW:57332 SEQ ID NO:194
MDTQSDYVWGTGSAGAWASRLSTDPATTWALEAGPRDKNRFIGVPAAFSKLFRSEIDWDYLTEP 

QPELDGREIYWPRGKVLGGSSSMNAMMVVVRGFASDYbEWAARAGPRWSYADVLGYFRRIENVTA 

AWHFVSGDDSGVTGPLHISRQRSPRSN/TAAWI^AARECGFAAARPNSPRPEGFCETVVTQRRGAR 

FSTADAYLKPAMRRKNLRVLTGATATRWIDGDRAVGVEYQSDGQTRIVYARREWLCAGAVNSPQL

LMLSGIGDRDHU^HDIDTVYHAPEVGCNLLDHLVTVLGFDVEKDSLFAAEKPGQLiSYLLRRRGMLT 

SNVGEAYGFVRSRPELKLPDLELIFAPAPFYDEALVPPAGHGWFGPILVAPQSRGQITLRSADPHAK 

PVIEPRYLSDLGGVDRAAM^^GLRICARIAQARPLRDLLGSIARPRNSTELDEATLELAI_ATCSHTLYH 

PMGTCRMGSDEASVVDPQLRVRGVDGLRVADASVMPSTVRGHTHAPSVLIGEKAADLIRS 

>Rv1294 thrA homoserine dehydrogenase TB.seq 1449373:1450695 MW:45522 SEQ ID NO:195
VPGDEKPVGVAVLGLGNVGSEWRIIENSAEDLAARVGAPLVLRGIGVRRVTTDRGVPIELLTDDIEEL 

VAREDVDIWEVMGPVEPSRKAILGALERGKSVVTANKALLATSTGELAQAAESAHVDLYFEAAVAGA 
IPVIRPLTQSLAGDTVLRVAGIVNGTTNYILSAMDSTGADYASALADASALGYAEADPTADVEGYDAA
AKAAILASIAFHTRVTADDWREGITKVn-PADFGSAHALGCTIKLLSICERITTDEGSQRVSARWPALV 
PLSHPLAAVNGAFNANAA/EAEAAGRLMFYGQGAGGAPTASAVTGDLVMAARNRVLGSRGPRESKY 
AQLPVAPMGFIETRYWSMNVADKPGVLSAVAAEFAKREVSIAEVRQEGWDEGGRRVGARIWVTH 

LATDAALSETVDALDDLDWQGVSSV1 RLEGTGL 

>Rv1323 fadM acetyl-CoA C-acetyltransferase (aka thiL) TB.seq 1485860:1487026 MW:40049
SEQ ID NO:196 

VIVAGARTPIGKLMGSLKDFSASELGAIAIKGALEKANVPASLVEYVIMGQVLTAGAGQMPARQAAVA 

AGIGWDVPALTINK^4CLSGIDAIALADQLIRAREFDWVAGGQESMT1<APHLLMNSRSGYKYGDVTVL 

DHMAYDGLHDVFTDQPMGALTEQRNDVDMFTRSEQDEYAAASHQKAAAAVVKDGVFADEVIPVNIP 

QRTGDPLQFTEDEGIRANTTAAALAGLKPAFRGDGTITAGSASQISDGAAAWVMNQEKAQELGLTW 

LAEIGAHGWAGPDSTLQSQPANAINKALDREGISVDQLDWEINEAFAAVALASIRELGLNPQIVNVN 

GGAIAVGHPLGMSGTRITLHAALQLARRGSGVGVAALCGAGGQGDALILRAG 

>Rv1389 gmk putative guanylate kinase TB.seq 1564399:1565022 MW:22064 SEQ ID NO:197 
VSVGEGPDTKPTARGQPAAVGRNA/VLSGPSAVGKSTWRCLRERIPNLHFSVSATTRAPRPGEVDG 
VDYHFIDPTRFQQLIDQGELLEWAEIHGGLHRSGTLAQPVHAAAATGVPVLIEVDLAGARAIKKTMPE 
ANHVFIJ^PSWQDLQARLIGRGTETADvlQRRLDTARIELAAQGDFDKWVNRRLESACAELVSIXVG 

TAPGSP 

>Rv1407 fmu similar to Fmu protein TB.seq 1583099:1584469 MW:48494 SEQ ID NO:198
MTPRSRGPRRRPLDPARRAAFETLRAVSARDAYANLVLPALLAQRGIGGRDAAFATELTYGTCRAR 

GLLDAVIGAAAERSPQAIDPVLLDLLRLGTYQLLRTRVDAHAAVSTTVEQAGIEFDSARAGFVNGVLR 

TIAGRDERSWVGEUVPDAQNDPIGHAAFVHAHPRW1AQAFADALGAAVGELEAVLASDDERPAVHLA 

ARPGvT-TAGEU^RAVRGTVGRYSPFAVYLPRGDPGRLAPVRDGQALVQDEGSQLVARALTLAPVDG 

DTGRWLDLCAGPGGKTALLAGLGLQCAARvTAVEPSPHRADLVAQNTRGLPVELLRVDGRHTDLDP 

GFDRV^VDAPCTGLGALRRRPEARWRRQPADVAALAKLQREU.SAAIALTRPGGWLYATCSPHLAE 

TVGAVADALRRHPVHALDTRPLFEPVIAGLGEGPHVQLWPHRHGTDAMFAAALRRLT 

>Rv1409 ribG riboflavin biosynthesis TB.seq 1585192:1586208 MW:35367 SEQ ID NO:199

MNVEQVKSIDEAMGLAIEHSYQVKGTTYPKPPVGAVIVDPNGRIVGAGGTEPAGGDHAEWALRRAG 

GU^GAIVWTMEPCNHYGKTPPCVNALIEARVGTVWAVADPNGIAGGGAGRLSAAGLQVKSGvT-A 

EQVAAGPLREWLHKQRTGLPHVTWKYATSIDGRSAAADGSSQWISSEAARLDLHRRRAIADAILVGT 

GTVLADDPALTARLADGSLAPQQPLRWVGKRDIPPEARVLNDEARTMMIRTHEPMEVLRALSDRTD 

VLl.EGGPTLAGAFLRAGAINRILAWAPILLGGP\n"AVDDVGVSNITNALRWQFDSVEKVGPDLLLSLv* 

AR 
>Rv1440 secG TB.seq 1617715:1618065 MW:12140 SEQ ID NO:200

VAG\n-AAVSARLKADEARRPGFYAAGSGPLPQVRGSTLPVMELALQITL.Nn-SWVVU_VU.HRAKGG 
GLSTLFGGGVQSSLSGSTWEKNLDRLTLFVTGIWLVSIIGVALLIKYR 

>Rv1484 inhA TB.seq 1674200:1675006 MW:28529 SEQ ID NO:201

MTGLLDGKRILVSGIITDSSIAFHIARVAQEQGAQLVLTGFDRLRLIQRITDRLPAKAPLLELDVQNEEH 
LASLAGRVTEAIGAGNKLDGWHSIGFMPQTGMGINPFFDAPYADVSKGIHISAYSYASMAKALLPIM 
NPGGSIVGMDFDPSRAMPAYNWMWAKSALESVNRFVAREAGKYGVRSNLVAAGPIRTLAMSAIVG 
GALGEEAGAQIQLLEEGWDQRAPIGWNMKDATPVAKTVCALLSDVVLPATTGDIIYADGGAHTQLL 






>Rv1617 pykA pyruvate kinase TB.seq 1816187:1817602 MW:50668 SEQ ID NO:202

VTORGKIVCTLGPATQRDDLVRALVEAGMDVARMNFSHGDYDDHKVAYERVRVASDATGRAVGVL 
ADLQGPKIRLGRFASGATHWAEGETVRIWGACEGSHDRVSTTYKRLAQDAVAGDRVLVDDGKVAL 
vVDAVEGDDWCTvVEGGPVSDNKGISLPGM^IVTAPALSEKDIEDLTFALNLGVD^4VALSFVRSPAD 

VELVHEVMDRIGRRVPVIAKLEKPEAIDNLEAIVLAFDAVMVA^

NAKPVIVATQMLDSMIENSRPTRAEASDVANAVLDGADALMLSGETSVGKYPLAAVRTMSRIICAVEE 

NSTAAPPLTHIPRTKRGVISYAARDIGERLDAKALVAFTQSGDTvKRLARLHTPLPLLAFTAWPEVRS 

QLAMTWGTETFIVPKMQSTDGMIRQVDKSLLEI^RYKRGDLWIVAGAPPGTVGSTNLIHVHRIGEDD 

V 



>Rv1630 rpsA 30S ribosomal protein S1 TB.seq 1833540:1834982 MW:53203 SEQ ID NO:203
MPSPTVTSPQVAVNDIGSSEDFLAAlDKTIKYFNDGDIv^GTIN^DRDEVlXDIGYKTEGVIPARELSIK 

HDVDPNEWSVGDEVEALVT-TKEDKEGRULSKKRAQYERAWGTlEALXEKDEAVKGTVIEVvXGGLI 

LDIGLRGFLPASLVEMRRVRDLQPYIGKEIEAKIIELDKNRNNWLSRRAWLEQTQSEVRSEFLNNLQK 

GTIRKGWSSIVNFGAFVDLGGVDGLWVSELSWKHIDHPSEWQVGDEVTVEVLDVDMDRERVSLS 

LKATQEDPWRHFARTHAIGQIvPGKvTKLVPFGAFVRvHEGIEGLVHISELAERHVEVPDQWAVGDD 

AMVKVIDIDLERRRISLSLKQANEDYTEEFDPAKYGMADSYDEQGNYIFPEGFDAETNEWLEGFEKQ 

RAEWEARYAEAERRHKMHTAQMEKFAAAEAAGRGADDQSSASSAPSEKTAGGSLASDAQLAALRE 

KLAGSA 

>Rv1631 - TB.seq 1835011:1836231 MW:44669 SEQ ID NO:204

MLRIGLTGGIGAGKSLLSTTFSQCGGIWDGDVLAREWQPGTEGLASLVDAFGRDILLADGALDRQA 

UVAKAFRDDESRGVLNGIVHPLVARRRSEIIAAVSGDAVVVEDIPLLVESGMAPLFPLVv>A/HAD 

VRRLVEQRGMAEADARARIAAQASDQQRRAVADVWLDNSGSPEDLVRRARDVWNTRVQPFAHNL 

AQRQIARAPARLVPADPSWPDQARRIVNRLKIACGHKALRVDHIGSTAVSGFPDFLAKDVIDIQVTVE 

SLDVADELAEPLLAAGYPRLEHITQDTEKTDARSTVGRYDHTDSAALWHKRVHASADPGRPTNVHLR 
VHGWPNQQFALLFVDWLAANPGAREDYLTVKCDADRRADGELARYVTAKEPWFLDAYQRAWEVVA
DAVHWRP 

>Rv1706c - TB.seq 1932695:1933876 MW:39779 SEQ ID NO:205 

MTLDVPVNQGHVPPGSVACCLVGVTAVADGIAGHSLSNFGALPPEINSGRMYSGPGSGPLMAAAAA 
WDGLAAELSSAATGYGAAISELTNMRWWSGPASDSMVAAVLPFVGWLST^ 

AAFEAAFAMWPPPAIAANRTLLMTLVDTNWFGQNTPAIATTESQYAEMWAQDAAAMYGYASAAAP 
ATVLTPFAPPPQTTNATGLVGHATAVAALRGQHSWAAAIPWSDIQKYV^MFLGALATAEGFIYDSG 
GLTLNALQFVGGMLWSTAIJ^EAGAAEAAAGAGGAAGWSAWSQLGAGPVAASATLAAKIGPMSVPP 
GWSAPPA7PQAQTVARSIPGIRSAAEAAETSVLLRGAPTPGRSRAAHMGRRYGRRLTVMADRPNVG 

>Rv1745c - similar to Q46822 ORF.0182 TB.seq 1971381:1971989 MW:22490 SEQ ID NO:206 
MTRSYRPAPPIERWU-NDRGDATGVADKATVHTGDTPLHLAFSSWFDLHDQLLITRRAATKRTWP 
AVVyTTNSCCGHPLPGESLPGAIRRRLAAELGLTPDRVDLILPGFRYRAAMADGTVENEICPVYRVQVD 
QQPRPNSDEVDAIRWLSWEQFVRDVTAGVIAPVSPWCRSQLGYLTKLGPCPAQWPVADDCRLPKA 

AHGN 



>Rv1800 - TB.seq 2039451:2041415 MW:67068 SEQ ID NO:207

MLPNFAVLPPEVNSARVFAGAGSAPMLAAAAAWDDLASELHCAAMSFGSVTSGLWGWWQGSASA 
AMVDAAASYIGWLSTSAAHAEGAAGl^RAAVSWEEALAAWHPAMVAANRAQVASLVASNLFGQN
APAIAALESLYECMWAQDAAAMAGYYVGASAVATQLASVVLQRLQSIPGAASLDARLPSSAEAPMGV 
VPJWNSAIAANAAAAQWGLVMGGSGTPIPSARYVELANALYMSGSVPGVIAQALFTPQGLYPWVIK 
NLTFDSSVAQGAVILESAIRQQIAAGNNVTVFGYSQSATISSLVMANLAASADPPSPDELSFTLIGNPN 
NPNGGVATRFPGISFPSLGVTATGATPHNLYPTKIYTIEYDGVADFPRYPLNFVSTLNAIAGTYYVHSN 
YFILTPEQIDAAVPLTNTVGPTMTQYYIIRTENLPLLEPLRSVPIVGNPLANLVQPNLKVIVNLGYGDPA
YGYSTSPPNVATPFGLFPEVSPWIADALVAGTQQGIGDFAYDVSHLELPLPADGSTMPSTAPGSGT 
PVPPLSIDSLIDDLQVANRNI^NTriSKVAATSYAWLPTADIANAALTIVPSYNIHLFLEGIQQALKGDPM 

GLVNAVGYPLAADVALFTAAGGLQLLIIISAGRTIANDISAIVP 

>Rv1844c gnd 6-phosphogluconate dehydrogenase (Gram -) TB.seq 2093732:2095186
MW:51548 SEQ ID NO:208 

MSSSESPAGIAQIGVTGLAVMGSNIARNFARHGYTVAVHNRSVAKTDALLKEHSSDGKFVRSETIPEF 
LAALEKPRRVLIMVKAGEATDADAVINELADAMEPGDIIIDGGNALYTDTMRREKAMRERGLHFVGAG 
ISGGEEGALNGPSIMPGGPAESYQSLGPLLEEISAHVDGVPCCTHIGPDGSGHFVKMVHNGIEYSDM 
QUGEAYQLMRDGLGLTAPAJADVFTEWNNGDLDSYLVEITAEVLRQTDAKTGKPLVDVIVDRAEQKG
TGRWTVKSALDLGVPVTGIAEAVFARALSGSVGQRSAASGLASGKLGEQPADPATFTEDVRQALYA 
SKIVAYAQGFNQIQAGSAEFGWDITPGDLATIWRGGCIIRAKFLNHIKEAFDASPNLASLIVAPYFRGA 
VESAIDSWRRWSTAAQLGIPTPGFSSALSYYDALRTARLPAALTQAQRDFFGAHTYGRIDEPGKFHT
LWSSDRTEVPV 

>Rv1900c lipJ TB.seq 2146246:2147631 MW:49685 SEQ ID NO:209 

VAQAPHIHRTRYAKCGDMDIAYQVLGDGPTDLLVLPGPFVPIDSIDDEPSLYRFHRRLASFSRVIRLDH 

RGVGLSSRLAAITTLGPKFWAQDAIAVMDAVGCEQATIFAPSFHAMNGLVLAADYPERVRSLIVVNGS 

ARPLWAPDYPVGAQVRRADPFLWALEPDAVERGFDVLSIVAPTVAGDDVFRAWWDLAGNRAGPP 

SIARAVSKVIAEADVRDVLGHIEAPTLILHRVGSTYIPVGHGRYIAEHIAGSRLVELPGTDTLYW^ 

GPMLDEIEEFITGVRGGADAERMLATIMFTDIVGSTQHAAALGDDRWRDLLDNHDTIVCHEIQRFGGR 

EVNTAGDGFVATFTSPSAAIACADDIVDAVAALGIEVRIGIHAGEVEVRDASHGTDVAGVAVHIGARVC 

ALAGPSEVLVSSTVRD1VAGSRHRFAERGEQELKGVPGRWRLCVLMRDDATRTR 

>Rv1967 - TB.seq 2210599:2211624 MW:36516 SEQ ID NO:210
MRENLGGWVRLGVFLAVCLLTAFUJAVFGEVRFGDGKTW 

RISINPDAWRVQFTADNS\ni.TRGTRAVlRYDNLFGDRYI_ALEEGAGGI_A\/LRPGHTIPLARTQPALD 
LDALIGGFKPLFRALNPEQVNALSEQLLHAFAGQGPTIGSLI^QSAANn-hTTLADRDRLIGQVITNLNW 
LGSLGAHTDRLDQAVTSLSALIHRLAQRKTDISNAVAYTNAAAGSVADLLSQARAPI-AKWRETDRVA 
GIAAADHDYLDNLLmT-PDKYC^VRQGMYGDFFAFYLCD\AA.KVNGKGGQP\^KLAGQDSGRCA 

PK 

>Rv1975 - TB.seq 2218050:2218712 MW:23650 SEQ ID NO:211

MSRRASATCALSATTAVAIMAAPAARADDKRLNDGWANVYTVQRQAGCT>IDVTINPQLQLAAQ 

TLDLLNNRHLNDDTGSDGSTPQDRAHAAGFRGKVAETVAINPAVAISGIELINQVVYYNPAFFAIMSDC 

ANTQIGVWSENSPDRTVWAVYGQPDRPSAMPPRGAVTGPPSPVAAQENVPIDPSPDYDASDEIEY 

GINWLPWILRGVYPPPAMPPQ 

>Rv1981c nrdF ribonucleotide reductase small subunit TB.seq 2224221:2225186 MW:36591
SEQ ID NO:212 

Vn"GKLVERVHAINWNRLLDAKDLQ\/WERLTGNFWLPEKIPLSNDI^SWQTLSSTEQQTTIRVFTGLT 

LLDTAQATVGAVAMIDDAVTPHEEAVLTNMAFMESVHAKSYSSIFSTLCSTKQIDDAFDWSEQNPYL 

QRKAQIIVDYYRGDDALKRKASSVMLESFLFYSGFYLPMYWSSRGKLTNTADLIRLIIRDEAVHGYYIG 

YKCQRGIJ^LTDAERADHREYTCELLrfrLYANEIDYAHDLYDELGWTDDVLPYMRYNANKALANLG 

YQPAFDRDTCQVNPAVRAALDPGAGENHDFFSGSGSSYVMGTHQPTTDTDWDF 



>Rv2092c helY helicase, Ski2 subfamily TB.seq 2349335:2352052 MW:99576 SEQ ID NO:213

WEI^ELDRFTAELPFSLDDFQQRACSALERGHGN^VCAPTGAGKTWGEFAVHLAU^GSKCFYTT 

PLKALSNQKHTDLTARYGRDQIGLLTGDLSVNGNAPVWMTTEVLRNMLYADSPALQGLSYWMDE 
VHFLADRMRGPVWEEVILQLPDDVRWSLSATVSNAEEFGGWIQTVRGDTTVWDEHRPVPLWQHV

LVGKRMFDLFDYRIGEAEGQPQVNRELLRHIAHRREADRMADWQPRRRGSGRPGFYRPPGRPEV1 

AKLDAEGLLPAITFVFSRAGCDAAVn-QCLRSPLRLTSEEERARIAEVIDHRCGDLADSDLAVLGYYEW 

REGLLRGLAAHHAGMLPAFRHWEELFTAGLVKAVFATETLALGINMPARTVVLERLVKFNGE 

LTPGEYTQLTGRAGRRGIDVEGHAWIWHPEIEPSEVAGLASTRTFPLRSSFAPSYNMTINLVHRMGP 

QQAHRLLEQSFAQYQADRSWGLVRGIERGNRILGEIAAELGGSDAPILEYARLRARVSELERAQARA 

SRLQRRQAATDALAALRRGDIITITHGRRGGLAWLESARDRDDPRPLVLTEHRWAGRISSADYSGTT 

PVGSMTLPKRVEHRQPRVRRDLASALRSAAAGLVIPAARRVSEAGGFHDPELESSREQLRRHPVHT 

SPGLEDQIRQAERYLRIERDNAQLERKVAAATNSLARTFDRFVGLLTEREFIDGPATDPWTODGRLL 

ARIYSESDLLVAECLRTGAWEGLKPAELAGWSAWYETRGGDGQGAPFGADVPTPRLRQALTQTS 

RLSTTLRADEQAHRITPSREPDDGFVRV1YRWSRTGDLAAALAAADVNGSGSPLI-AGDFVRWCRQV 

LDLLDQVRNAAPNPELRATAKRAIGDIRRGWAVDAG 

>Rv2101 helZ helicase, Snf2/Rad54 family TB.seq 2360238:2363276 MW:111632
SEQ ID NO:214 

MLVLHGFWSNSGGMRLWAEDSDLLVKSPSQALRSARPHPFAAPADLIAGIHPGKPATAVLLLPSLRS 

APLDSPELIRLAPRPAARTDPMUJ^WTVPWDLDPTAALAAFDQPAPDVRYGASVDYLAELAVFAREL 

VERGR\^PQLRRDTHGAAACVVRP\A.QGRDWAMTSLVSAMPPVCRAEVGGHDPHELATSALDAMV 

DAAVRAALSPMDLLPPRRGRSKRHRAVEAWLTALTCPDGRFDAEPDELDALAEALRPVVDDVGIGTV 

GPARATFRLSEVETENEETPAGSLWRLEFLLQSTQDPSLLVPAEQAWNDDGSLRRWLDRPQELLLT 

ELGRASRIFPELVPALRTACPSGLELDADGAYRFLSGTAAVLDEAGFGVLLPSVVWDRRRKLGLVLSA 

YTPVDGWGKASKFGREQLVEFRWELAVGDDPLSEEEIAALTETKSPLIRLRGQWVALDTEQMRRGL 

EFLERKPTGRKTTAEILALAASHPDDVDTPLEVTAVRADGWLGDLLAGAAAASLQPLDPPDGFTATLR 

PYQQRGLAWLAFLSSLGLGSCLADDMGLGKTVQLLALETLESVQRHQDRGVGPTLLLCPMSLVGN 

WPQEAARFAPNLRVYAHHGGARLHGEALRDHLERTDLWSTYTTATRDIDELAEYEWNRWLDEAQ 

AVKNSLSRAAKAVRRLRAAHRVALTGTPMENRLAELWSIMDFLNPGLLGSSERFRTRYAIPIERHGHT 

EPAERLRASTRPYILRRLKTDPAIIDDLPEKIEIKQYCQLTTEQASLYQAWADMMEKIENTEGIERRGN 

VU^AMAKLKQVCNHPAQLLHDRSPVGRRSGKVIRLEEILEEILAEGDRVLCFTQFTEFAELLVPHLAAR 

FGRAARDIAYLHGGTPRKRRDEMVARFQSGDGPPIFLLSLKAGGTGLNLTAANHWHLDRWWNPAV 

ENQATDRAFRIGQRRTVQVRKFICTGTLEEKIDEMIEEKKALADLWTDGEGWLTELSTRDLREVFAL 

SEGAVGE 

>Rv2110c prcB proteasome [beta]-type subunit 2 TB.seq 2369727:2370599 MW:30274
SEQ ID NO:215 

VTWPLPDRLSINSLSGTPAVDLSSFTDFLRRQAPELLPASISGGAPLAGGDAQLPHGTTIVALKYPGG 
WMAGDRRSTQGNMISGRDVRKVYITDDYTATGIAGTAAVAVEFARLYAVELEHYEKLEGVPLTFAG 
KINRLAIMVRGNUW^MQGLLALPLLAGYDIHASDPQSAGRIVSFDAAGGWNIEEEGYQAVGSGSLFA 
KSSMKKLYSQVTDGDSGLRVAVEALYDAADDDSATGGPDLVRGIFPTAV1IDADGAVDVPESRIAELA
RAII ESRSGADTFGSDGGEK 

>Rv2118c - =B2126_C1_165 (83.6%) TB.seq 2377471:2378310 MW:30091 SEQ ID NO:216
VSATGPFS1GERVQLTDAKGRRYTMSLTPGAEFHTHRGSIAHDAV1GLEQGSWKSSNGALFLVLRPL 
LVDYVMSMPRGPQV1YPKDAAQ1VHEGDIFPGARVLEAGAGSGALTLSLLRAVGPAGQV1SYEQRAD 
HAEHARR^SGCYGQPPDNWRLWSDLADSELPDGSVDP^VLDMLAPVVEVLDAVSRLLVAGGVLM 
VWATVTQLSRIVEALRAKQCV\TTEPRAVVETLQRGWNWGI^VRPQHSMRGI^AFLV 

VAPAPLGRKREGRDG 

>Rv2144c - TB.seq 2404166:2404519 MW:12028 SEQ ID NO:217 
MLIIALVLALIGLIjy-VFAVVTSNQLVAVVVCIGASVLGVALLIVDALRERQQGGADEADGAGETGVAEE 

ADVDYPEEAPEESQAVDAGVIGSEEPSEEASEATEESAVSADRSDDSAK 

>Rv2146c - TB.seq 2405667:2405954 MW:10805 SEQ ID NO:218

LvVFFQILGFALFIFV\^LLIARVWEFIRSFSRDWRPTG\nVVILEIIMSITDPPVKVLRRLIPQLTIGAVRF 

DLSIMVLLLVAFIGMQLAFGAAA 

>Rv2147c - TB.seq 2406119:2406841 MW:27630 SEQ ID NO:219
VNSHCSHTFITDNRSPP^RGHAMSTLHKVKAYFGMAPMEDYDDEYYDDRAPSRGYARPRFDDDY

GRYDGRDYDDARSDSRGDLRGEPADYPPPGYRGGYADEPRFRPREFDRAEMTRPRFGSWLRNST 

RGALAMDPRRMAMMFEDGHPLSKITTLRPKDYSEARTIGERFRDGSPVIMDLVSMDNADAKRLVDF 

AAGLAFALRGSFDKVATKVFLLSPADVDVSPEERRRIAETGFYAYQ 

>Rv2148c - TB.seq 2406841:2407614 MW:27694 SEQ ID NO:220

MAADLSAYPDRESELTHALAAMRSRLAAAAEAAGRNVGEIELLPITKFFPATDVAILFRLGCRSVGES 
REQEASAKMAELNRLUVAAELGHSGGWWHMVGRIQRNKAGSI^RWAHTAHSVDSSRL\n-ALDRA 
WAALAEHRRGERLRVYVQVSLDGDGSRGGVDSTTPGAVDRICAQVQESEGLELVGLMGIPPLDWD 
-nEAFDRLQSEHNRVRAMFPHAIGLSAGMSNDLEVAVKHGSTCVRVGTALLGPRRLRSP 



>Rv2150c ftsZ TB.seq 2408386:2409522 MW:38757 SEQ ID NO:221

IVH'PPHNYI^VIKWGIGGGGVNAVNRMIEQGLKGVEFIAINTDAQALLMSDADVKLDVGRDSTRGLG 
AGADPEVGRKAAEDAKDEIEELLRGADMVFVTAGEGGGTGTGGAPWASIARKLGALTVGWTRPF 
SFEGKRRSNC^NGIAALRESCDTLIVIPNDRLLQMGDAAVSLMDAFRSADEVLLNGVQGITDLITTP 
GLINVDFADVKGIMSGAGTALMGIGSARGEGRSLKAAEIAINSPLLEASMEGAQGVLMSIAGGSDLGL 
FEINEAASLVQDAAHPDANIIFGTVIDDSLGDEVRVTVIAAGFDVSGPGRKPVMGETGGAHRIESAKA 
GKLTSTLFEPVDAVSVPLHTNGATLSIGGDDDDVDVPPFMRR 
>Rv2152c murC TB.seq 2410639:2412120 MW:51146 SEQ ID NO:222

VSTEQLPPDLRRVHMVGIGGAGMSGIARILLDRGGLVSGSDAKESRGVHALRARGALIRIGHDASSL 

DLLPGGATAWTTHAAIPICrNPELVEARRRGIPWLRPAVLAKLMAGRTTLM\n"GTH 

LQHCGLDPSFAVGGELGEAGTNAHHGSGDCFVAEADESDGSLLQYTPHVAVITNIESDHLDFYGSVE 

AYVAVFDSFVERIVPGGALWCTDDPGGAALAQRATELGIRVLRYGSVPGETMAATLVSWQQQGVG 

AVAHIRLASELATAQGPRVMRLSVPGRHMALNALGALLAAVQIGAPADEVLDGLAGFEGVRRRFELV 

GTCGVGKASVRVFDDYAHHPTEISATLAAARMVLEQGDGGRCMWFQPHLYSRTKAFAAEFGRALN 

AADEVFVLDWGAREQPLAGVSGASVAEHXAVPMRYVPDFSAVAQQVAAAASPGDVIVTMGAGDVT 

LLGPEILTALRVRANRSAPGRPGVLG 

>Rv2153c murG TB.seq 2412120:2413349 MW:41829 SEQ ID NO:223 

VKDTVSQPAGGRGATAPRPADAASPSCGSSPSADSVSWLAGGGTAGHVEPAMAVADALVALDPR 

VRn"ALGTLRGLETRLVPQRGYHLELITAVPMPRKPGGDLARLPSRVWRAVREARDVLDDVDADWV 

GFGGYVALPAYLAARGLPLPPRRRRRIPWIHEANARAGLANRVGAHTADRVLSAVPDSGLRRAEW 

GVPVPJ^SIAALDRAV^RAEARAHFGFPDDARVLLWGGSQGAVSLNRAVSGAAADLAAAGVCVLHA 

HGPQNVLELRRRAQGDPPYVAVPYLDRMELAYAAADLVICRAGAMTVAEVSAVGLPAIYVPLPIGNG 

EQRLNALPWNAGGGMWADAALTPELVARQVAGLLTDPARLAAMTAAAARVGHRDAAGQVARAAL 

AVATGAGARTTT 

>Rv2154c ftsW TB.seq 2413349:2414920 MW:56306 SEQ ID NO:224 

VLTRLLRRGTSDTDGSQTRGAEPVEGQRTGPEEASNPGSARPRTRFGAWLGRPMTSFHLIIAVAALL 

TTLGLIMVLSASAVRSYDDDGSAVVVIFGKQVLVVTLVGLIGGYVCLRMSVRFMRRIAFSGFAITIVMLVL 

VLWGIGKEANGSRGWFWAGFSMQPSEUVKMAFAIWGAHLLAARRMERASLREMLIPLVPAAWAL 

ALIVAQPDLGQTVSMGIILLGLLWYAGLPLRVFLSSLAAWVSAAILAVSAGYRSDRVRSVVLNPENDP 

QDSGYQARQAKFALAQGGIFGDGLGQGVAKWNYLPNAHNDFIFAIIGEELGLVGALGLLGLFGLFAY 

TGMRIASRSADPFLRLLTATTTL\AM.GQAFINIGYVIGLLPVTGLQLPLISAGGTSTAATLSLIGIIANAAR 

HEPEAVAALRAGRDDKVNRLLRLPLPEPYLPPRLEAFRDRKRANPQPAQTQPARKTPRTAPGQPAR 

QMGLPPRPGSPRTADPPVRRSVHHGAGQRYAGQRRTRRVRALEGQRYG 

>Rv2155c murD TB.seq 2414935:2416392 MW:49314 SEQ ID NO:225 

X^DPLGPGAPVLVAGGRVTGQAVAAVLTRFGATPTVCDDDPVMLRPHAERGLPTVSSSDAVQQITG 
YALWASPGFSPATPLUWVAAAGVPIWGDVELAWRLDAAGCYGPPRSWLV\n"GTNGKTTTTSMLH 
AMLIAGGRRAVLCGNIGSAVLDVLDEPAELLAVELSSFQLHWAPSLRPEAGAVLNIAEDHLDWHATM 
AEYTAAKARVLTGGVAVAGLDDSRAAALLDGSPAQVRVGFRLGEPAARELGVRDAHLVDRAFSDDL 
TLLPVASIPVPGPVGVLDALAAAALARSVGVPAGAIADAVTSFRVGRHRAEWAVADGITYVDDSKAT 
NPHAARASVLAYPRWWIAGGLLKGASLHAEVAAMASRLVGAVLIGRDRAAVAEALSRHAPDVPWQ 
WAGEDTGMPATVEVPVACVLDVAKDDKAGETVGAAVMTAAVAAARRMAQPGD7VLLAPAGASFD
QFTGYADRGEAFATAVRAVIR 

>Rv2156c murX TB.seq 2416397:2417473 MW:37714 SEQ ID NO:226 
MRQILIAVAVAVTVSILLTPVLIRLFTKQGFGHQIREDGPPSHHTKRGTPSMGGVA1LAGIWAGYLGAH
LAGLAFDGEGIGASGLLVLGLATALGGVGFIDDLIKIRRSRNLGLNKTAKTVGQITSAVLFGVLVLQFRN 

AAGLTPGSADLSWREIATVTIAPVLFVLFCWIVSAWSNAVNFTDGLDGLAAGT^ 

WQYRNACVTAPGLGCYNVRDPLDLALIAAATAGACIGFLWWNAAPAKIFMGDTGSLALGGVIAGLSV 

TSRTEILAWLGALFVAEITSVVLQILTFRTTGRRMFRMAPFHHHFELVGWAET7VIIRFVVLLTAITCGL 

GVALFYGEWLAAVGA
>Rv2157c murF TB.seq 2417473:2419002 MW:51634 SEQ ID NO:227

MIELTVAQIAEIVGGAVADISPQDAAHRRVTGTVEFDSRAIGPGGLFLALPGARADGHDHAASAVAAG 
/^WLAARPVGVPAIWPPVAAPNVLAGVLEHD 

SGKTSTXDU^AAVLAPLGEWAPPGSFNNELGHPVVWLRATRRTDYLILEMAARHHGNIAALA^ 

SIGVVLNVGTAHLGEFGSREVIAQTKAELPQAVPHSGAWLNADDPAVAAMAKLTAARWRVSRDNT 

GDVWAGPVSLDEU^PRFTLHAHDAC^EVRLGVCGDHQVTNALCAAAVALECGASVEQVAAALTAA 

PPVSRHRMQVTTRGDGVTVIDDAYNANPDSMRAGLQALAWIAHQPEATRRSWAVLGEMAELGEDAI 

AEHDRIGRLAVRLDVSRLWVGTGRSISAMHHGAVLEGAWGSGEATADHGADRTAVNVADGDAALA 

LLRAELRPGDWLVKASNAAGLGAVADALVADDTCGSVRP 
>Rv2158c murE TB.seq 2419002:2420606 MW:55310 SEQ ID NO:228 

VSSLARGISRRRTEVATQVEAAPTGLRPNAWGVRLAALADQVGAALAEGPAQRAWEDRTWGVTL 

RAQDVSPGDLFAALTGSTTHGARHVGDAIARGAVAVLTDPAGVAEIAGRAAVPVLVHPAPRGVLGGL 

AATWGHPSERLTV1GITGTSGKTTTTYLVEAGLRAAGRVAGLIGTIGIRVGGADLPSALTTPEAPTI.QA 

MLAAMVERGVDTVVMEVSSHALALGRVDGTRFAVGAFTNLSRDHLDFHPSMADYFEAKASLFDPDS 

ALRARTAWCIDDDAGRAMAARAADAITVSAADRPAHVVRATDVAPTDAGGQQFTAIDPAGVGHHIGI 

RLPGRYNVANCLVALAILDTVGVSPEQAVPGLREIRVPGRLEQIDRGQGFLALVDYAHKPEALRSVLT 

TU^PDRRLAWFGAGGDRDPGKRAPMGRIAAQLADLVVVTDDNPRDEDPTAIRREILAGAAEVGGD 

AQWEIADRRDAIRHAVAWARPGDWLIAGKGHETGQRGGGRVRPFDDRVELAAALEALERRA 

>Rv2159c - TB.seq 2420632:2421663 MW:36377 SEQ ID NO:229 

MKFVNHIEPVAPRRAGGAVAEVYAEARREFGRLPEPLAMLSPDEGLLTAGWATLRETLLVGQVPRG 
RKEAVAAAVAASLRCPWCVDAHTTMLYAAGQTDTAAAILAGTAPAAGDPNAPYVAWAAGTGTPAGP 
PAPFGPDVAAEYLGTAVQFHFIARLVLVLLDETFLPGGPRAQQLMRRAGGLVFARKVRAEHRPGRST 
RRLEPRTLPDDLAWATPSEPIATAFAALSHHLDTAPHLPPPTRQWRRWGSWHGEPMPMSSRWTN 
EHTAELPADLHAPTRU\LLTG1_APHQVTDDDVAAARSLLDTDAALVGALAWAAFTAARRIGTWIGAAA
EGQVSRQNPTG 

>Rv2163c pbpB TB.seq 2425049:2427085 MW:72506 SEQ ID NO:230 
VSRAAPRRASQSQSTRPARGLRRPPGAQEVGQRKRPGKTQKARQAQEATKSRPATRSDVAPAGR

STRARRTRQWDVGTRGASFVFRHRTGNAVILVLMLVAATQLFFLQVSHAAGLRAQAAGQLKVTDV 

QPAARGSIVDRNNDRLAFTIEARALTFQPKRIRRQLEEARKKTSAAPDPQQRLRDIAQEVAGKLNNKP 

DAAAVLKKLQSDETFVYl^VDPAVASAICAKYPEVGAERQDLRQYPGGSLAANWG 

LLGLEDSLDAVLAGTDGSVTYDRGSDGWIPGSYRNRHKAVHGSTWLTLDNDIQFYVQQQVQQAK 

NLSGAHNVSAWLDAKTGEVLAMANDNTFDPSQDIGRQGDKQLGNPAVSSPFEPGSVNKIVAASAVI

EHGLSSPDEVLQVPGSIQMGGVWHDAWEHGVMPYTTTGVFGKSSNVGTLMLSQRVGPERYYDML 

RKFGLGQRTGVGLPGESAGLVPPIDQWSGSTFANLPIGQGLSMTLLQMTGMYQAIANDGVRVPPRII 

KATVAPDGSRTEEPRPDDIRWSAQTAQTVRQMLRAWQRDPMGYQQGTGPTAGVPGYQMAGKT 

GTAQQINPGCGCYFDDVYWITFAGIATADNPRYVIGIMLDNPARNSDGAPGHSAAPLFHNIAGWLMQ 

RENVPLSPDPGPPLVLQAT

>Rv2165c - TB.seq 2428236:2429423 MW:42498 SEQ ID NO:231 

VQTRAPWSLPEATLAYFPNARFVSSDRDLGAGAAPGIAASRSTACQTWGGITVADPGSGPTGFGHV 
PVLAQRCFELLTPALTRYYPDGSQAVLLDATIGAGGHAERFLEGLPGLRLIGLDRDPTALDVARSRLV 
RFADRLTLVHTRYDCLGAALAESGYAAVGSVDGILFDLGVSSMQLDRAERGFAYATDAPLDMRMDP
TTPLTAADIVNTYDEAALADILRRYGEERFARRIAAGIVRRRAKTPFTSTAELVALLYQAIPAPARRVGG 
HPAKRTFQALRIAVNDELESLRTAVPAALDALAIGGRIAVLAYQSLEDRIVKRVFAEAVASATPAGLPV 
ELPGHEPRFRSLTHGAERASVAEIERNPRSTPVRLRALQRVEHRAQSQQWATEKGDS 

>Rv2166c - TB.seq 2429428:2429856 MW:15912 SEQ ID NO:232

MFLGTYTPKLDDKGRLTLPAKFRDALAGGLMVTKSQDHSLAVYPRAAFEQLARRASKAPRSNPEAR 

AFLRNLAAGTDEQHPDSQGRITLSADHRRYASLSKDCWIGAVDYLEIWDAQAWQNYQQIHEENFSA 
ASDEALGDIF 

>Rv2197c - TB.seq 2461505:2462146 MW:22481 SEQ ID NO:233 

MVSRYSAYRRGPDVISPDVIDRILVGACAAVWLVFTGVSVAAAVALMDLGRGFHEMAGNPHTTVVVL 
YAVIWSALVIVGAIPVLLRARRMAEAEPATRPTGASVRGGRSIGSGHPAKRAVAESAPVQHADAFEV 
AAEWSSEAVDRIWLRGTWLTSAIGIALIAVAAATYLMAVGHDGPSWISYGLAGWTAGMPVIEVVLYA 

RQLRRWAPQSS 

>Rv2198c - TB.seq 2462149:2463045 MW:30955 SEQ ID NO:234

MSGPNPPGREPDEPESEPVSDTGDERASGNHLPPVAGGGDKLPSDQTGETDAYSRAYSAPESEHV 
TGGPWPADLRLYDYDDYEESSDLDDELAAPRWPWWGVAAIIAAVALWSVSLLVTRPHTSKLATG 
DTTSSAPPVQDEITTTKPAPPPPPPAPPPTTEIPTATETQTVTVTPPPPPPPATTTAPPPATTT^ 
PPTTTTPTGPRQ\m'S\n-GTKAPGDIIS\mVDAAGRRRTQHNWIPWSMTVTPISQSDVGSVEASSL
FRVSKLNCSITTSDGTVLSSNSNDGPQTSC 

>Rv2199c - TB.seq 2463234:2463650 MW:14866 SEQ ID NO:235
MHIEARLFEFVAAFFWTA\A.YG\A.TSMFATGGVEWAGTTALALTGGMALIVATFFRFVARRLDSRPE
DYEGAEISDGAGELGFFSPHSVVWPIMVALSGSVAAVGIALWLPWLIAAGVAFILASAAGLVFEYYVGP 

EKH 

>Rv2200c ctaC TB.seq 2463661:2464749 MW:40449 SEQ ID NO:236 
VTPRGPGRLQRLSQCRPQRGSGGPARGLRQLALAAMLGALAVTVSGCSWSEALGIGWPEGITPEA
HLNRELWIGAVIASU\VGVIVWGLIFWSAVFHRKKNTDTELPRQFGYNMPLELVLT\^PFLIISVLFYFT 
\AA^QEKMLQIAKDPE\A/IDITSFQWNWKFGYQRVNFKDGTLTYDGADPERKRAMVSKPEGKDKYGE 
ELVGPVRGLNTEDRTYLNFDKVETLGTSTEIPVLVLPSGKRIEFQMASADVIHAFVVVPEFLFKRDVMP 
NPVANNSVNVFQIEEIWGAFVGHCAEMCGTYHSMMNFEVRWTPNDFKAYLQQRIDGKTNAEALR 

AINQPPLAVTTHPFDTRRGELAPQPVG

>Rv2427c proA g-glutamyl phosphate reductase TB.seq 2724231:2725475 MW:43746
SEQ ID NO:237 

MWPAPSQLDLRQEVHDAARRARVAARRUVSLPTTVKDRALHAAADELLAHRDQILAANAEDLNAAR 
EADTPAAMLDRLSLNPQRVDGIAAGLRQVAGLRDPVGEVLRGYTLPNGLQLRQQRVPLGWGMIYE
GRPNNnVDAFGLTLKSGNAALLRGSSSAAKSNEALVAVLRTALVGLELPADAVQLLSAADRATVTHLI 
QARGLVDWIPRGGAGLIEAWRDAQVPTIETGVGNCHVYVHQAADLDVAERILLNSKTRRPSVCNA 
AETLLVDAAIAETALPRLIJ^ALQHAGVTVHLDPDEADLRREYLSLDIAVAWDGVDAAIAHINEYGTGH 
TEAIVTTNLDAAQRFTEQIDAAAVMVNASTAFTDGEQFGFGAEIGISTQKLHARGPMGLPELTSTKWI 

AWGAGHTRPA

>Rv2438c - similar to YHN4_YEAST P38795 TB.seq 2734793:2737006 MW:80492
SEQ ID NO:238 

MGLLGGQSGPRVGSGPVGSIPTPVNAAICQQRGGFHGVERGYSAGDSGVLTSLGDNERTMNFYSA 

YQHGFVRVAACTHHTTIGDPAANAASVLDMARACHDDGAALAVFPELTLSGYSIEDVLLQDSLLDAV
EDALLDLVTESADLLPVLWGAPLRHRHRIYNTAWIHRGAVLGWPKSYLPTYREFYERRQMAPGD 
GERGTIRIGGADVAFGTDLLFAASDLPGFVLHVEICEDMFVPMPPSAEAALAGATVLANLSGSPITIGR 
AEDRRLLARSASARCLAAYVYAAAGEGESTTDLAWDGQTMIWENGALLAESERFPKGVRRSVADVD 
TELLRSERLRMGTFDDNRRHHRELTESFRRIDFALDPPAGDIGLLREVERFPFVPADPQRLQQDCYE 

AYNIQVSGLEQRLRALDYPKWIGVSGGLDSTHALIVATHAMDREGRPRSDILAFALPGFATGEHTKN
NAIKLARALGVTFSEIDIGDTARLMLHTIGHPYSVGEKVYDVTFENVQAGLRTDYLFRIANQRGGIVLG 
TGDLSELALGWSTYGVGDQMSHYNVNAGVPKTLIQHLIRWVISAGEFGEKVGEVLQSVLDTEITPELI 
PTGEEELQSSEAKVGPFALQDFSLFQVLRYGFRPSKIAFLAWHAWNDAERGNWPPGFPKSERPSYS
LAEIRHWLQIFVQRFYSFSQFKRSALPNGPKVSHGGALSPRGDWRAPSDMSARIWLDQIDREVPKG 

>Rv2439c proB glutamate 5-kinase TB.seq 2737118:2738245 MW:38789 SEQ ID NO:239 

MRSPHRDAIRTARGLWKVGTTALTTPSGMFDAGRLAGLAEAVERRMKAGSDVVIVSSGAIAAGIEPL 

GLSRRPKDU^TKOAAASVGQVALVNSWSAAFARYGRWGQVLLTAHDISMRVQHTNAQRTLDRLRA 

LHAVAIVNENDTVATNEIRFGDNDRLSALVAHLVGADALVLLSDIDGLYDCDPRKTADATFIPEVSGPA 

DLDGWAGRSSHLGTGGMASKVAAALLAADAGVPVLLAPAADAATALADASVGTVFAARPARLSAR 

RFVVVRYAAEATGALTLDAGAVRAWRQRRSLLAAGITAVSGRFCGGDWELRAPDAAMVARGWAY 

DASELATMVGRSTSELPGELRRPWHADDLVAVSAKQAKQV 

>Rv2440c obg Obg GTP-binding protein TB.seq 2738248:2739684 MW:50430 
SEQ ID NO:240 

VPRFVDRWIHTRAGSGGNGCASVHREKFKPLGGPDGGNGGRGGSIVFWDPQVHTLLDFHFRPHL 

TAASGKHGMGNNRDGAAGADLEVKVPEGTWLDENGRLLADLVGAGTRFEAAAGGRGGLGNAALA 

SRVRKAPGFALLGEKGQSRDLTLELIOVADVGLVGFPSAGKSSLVSAISAAKPKIADYPFTTLVPNLG 

WSAGEHAFTVADVPGLIPGASRGRGLGLDFLRHIERCAVLVHWDCATAEPGRDPISDIDALETELA 

CYTPTLQGDAALGDLAARPRAVVLNKIDVPEARELAEFVRDDIAQRGWPVFCVSTATRENLQPLIFGL 

SQMISDYNAARPVAVPRRPVIRPIPVDDSGFTVEPDGHGGFWSGARPERWIDQTNFDNDEAVGYL 

ADRLARLGVEEELLRLGARSGCAVTIGEMTFDWEPQTPAGEPVAMSGRGTDPRLDSNKRVGAAER 

KAARSRRREHGDG 

>Rv2441c rpmA 50S ribosomal protein L27 TB.seq 2739773:2740030 MW:8969
SEQ ID NO:241

MAHKKGASSSRNGRDSAAQRLGVKRYGGQWKAGEILVRQRGTKFHPGVNVGRGGDDTLFAKTAG 
AVEFGIKRGRKTVSIVGSTTA 

>Rv2442c rplU 50S ribosomal protein L21 TB.seq 2740048:2740359 MW:11152
SEQ ID NO:242 

MMATYAIVKTGGKQYKVAVGDWKVEKLESEQGEKVS^ 
HTKGPKIRIHKFKNKTGYHKRQGHRQQLTVLKVTGIA 

>Rv2448c valS valyl-tRNA synthase TB.seq 2747596:2750223 MW:97822 SEQ ID NO:243 

MLPKSWDPAAMESAlYQKV\^DAGYFTADPTSTKPAYSIVLPPPNVTGSLHMGHALEHTMMDALTRR 

KRMQGYEVLWQPGTDHAGIATQSWEQQLAVDGKTKEDLGRELFVDKVWDWKRESGGAIGGQMR 

RLGDGVDWSRDRFTMDEGLSRAVRTIFKRLYDAGLIYRAERLVNWSPVLQTAISDLEVNYRDVEGEL 

VSFRYGSLDDSQPHIWATTRVETMLGDTAIAVHPDDERYRHLVGTSLAHPFVDRELAIVADEHVDPE 
FGTGAVK\n"PAHDPNDFEIGVRHQLPMPSILDTKGR1VDTGTRFDGMDRFEARVAVRQALAAQGRV

VEEKRPYLHSVGHSERSGEPIEPRLSLQWWVRVESLAKAAGDAVRNGDTVIHPASMEPRWFSWVD 

DMHDWCISRQLVVWGHRIPIWYGPDGEQVCVGPDETPPQGWEQDPDVLDTWFSSALWPFSTLGW 

PDKTAELEKFYPTSVLVTGYDILFFWVARMMMFGTFVGDDAAITLDGRRGPQVPFTDVFLHGLIRDE 

SGRKMSKSKGNMDPLDWVEMFGADALRFTLARGASPGGDLAVSEDAVRASRNFGTKLFNATRYAL 

LNGAAPAPLPSPNELTDADRWILGRLEEVRAEVDSAFDGYEFSRACESLYHFAWDEFCDVVYLELAK 

TQU^QGLTHTTAVLAAGLDTLLRU-HPVIPFLTEALWl^LTGRESLVSADWPEPSGISVDLVAAQRIND 

MQKLWEVRRFRSDQGl^DRQKVPARMHGVRDSDLSNQVAAVTSLAVVLTEPGPDFEPSVSLEVRL 

GPEMNRTVWELDTSGT1DVAAERRRLEKELAGAQKELASTAAKI.ANADFLAKAPDAVIAKIRDRQRV 

AQQETERITTRLAALQ 



>Rv2482c plsB2 TB.seq 2786915:2789281 MW:88284 SEQ ID NO:244 

VTKPAADASAVLTAEDTLVLASTATPVEMELIMGWLGQQRARHPDSKFDILKLPPRNAPPAALTALVE 
QLEPGFASSPQSGEDRSIVPVRVIWLPPADRSRAGKVAALLPGRDPYHPSQRQQRRILRTDPRRAR 
WAGESAKVSELRQQWRDTWAEHKRDFAQFVSRRALLALARAEYRILGPQYKSPRLVKPEMLASA 
RFRAGLDRIPGATVEDAGKMLDELSTGWSQVSVDLVSVLGRLASRGFDPEFDYDEYQVAAMRAALE 
AHPAVLLFSHRSYIDGVWPVAMQDNRLPPVHMFGGINLSFGLMGPLMRRSGMIFIRRNIGNDPLYK 
YVLKEYVGYWEKRFNLSWSIEGTRSRTGKMLPPKLGLMSYVADAYLDGRSDDILLQGVSICFDQLH 
EITEYAAYARGAEKTPEGLRWLYNFIKAQGERNFGKIYVRFPEAVSMRQYLGAPHGELTQDPAAKRL 
ALQKMSFEVAWR.LQATPVTATGLVSALLLTTRGTALTLDQLHHTLQDSLDYLERKQSPVSTSALRLR
SREGVRAAADALSNGHPVTRVDSGREPN^APDDEHAAAF^RNSVIHAFLETSIVELALAHAKHAE 
GDRVAAFWAQAMRLRDLLKFDFYFADSTAFRANIAQEMAWHQDWEDHLGVGGNEIDAMLYAKRPL 
MSDAMLRVFFEAYE.VADVLRDAPPD.GPEELTELALGLGRQFVAQGRVRSSEPVSTLLFATARQVAV 
DQELIAPAADLAERRVAFRRELRNILRDFDYVEQIARNQFVACEFKARQGRDRI 



>Rv2509 - putative oxidoreductase TB.seq 2824676:2825479 MW:28014 SEQ ID NO:245 

MPIPAPSPDARAWTGASQNIGAALATELAARGHHLIVTARREDVLTEL^ARLADKYRVTVDVRPADL 

ADPQERSKLADELAARP.S.LCANAGTATFGPIASLDLAGEKTQVQLNAVAVHDLTLAVLPGM.ERKAG 

GILISGSAAGNSPIPYNATYAATKAFVNTFSESLRGELRGSGVHVTVLAPGPVRTELPPASEASLVEKL 

VPDFLWISTEHTARVSLNALERNKMRWPGLTSKAMSVASQYAPRAIVAPIVGAFYKRLGGS 



>Rv2524c fas fatty acid synthase TB.seq 2840124:2849330 MW:326226 SEQ ID NO:246
vTIHEHDRVSADRGGDSPHTTHALVDRLMAGEPYAVAFGGQGSAWLETLEELVSATGIETELATLVG 

EAELLLDFVTDELIWRPIGFEPLQVWRALAAEDPVPSDKHLTSAAVSVPGN^LTQIAATRALARQGM 
DLVATPPVAMAGHSQGVLAVEALKAGGARDVELFALAQLIGAAGTLVARRRGISVLGDRPPMVSVTN
ADPERIGRLLDEFAQDVRTVLPPVLSIRNGRRAWITGTPEQLSRFELYCRQISEKEEADRKNKVRGG

DVFSPVFEPVQVEVGFHTPRLSDGIDIVAGWAEKAGLDVALARELADAILIRKVDWVDEITRVHAAGA 
RWILDLGPGDILTRLTAPVIRGLGIGIVPAATRGGQRNLFTVGATPEVARAWSSYAP7WRLPDGRVK

LSTKFTRLTGRSPILLAGMTPTTVDAKIVAAAANAGHWAELAGGGQVTEEIFGNR1EQMAGLLEPGRT 

YQFNALFLDPYLWKLQVGGKRLVQKARQSGAAIDGWISAGIPDLDEAVELIDELGDIGISHWFKPGT 

IEQIRSVIRIATEVPTKPVIMHVEGGRAGGHHSWEDLDDLLLATYSELRSRANIWCVGGGIGTPRRAA 

EYLSGRWAC^YGFPLMPIDGILVGTAAMATKESTTSPSWRMLVDTQGTDQWISAGKAQGGMASSR 

SQLGADIHEIDNSASRCGRLLDEVAGDAEAVAERRDEIIAAMAKTAKPYFGDVADMTYLQWLRRYVE 

LAIGEGNSTADTASVGSPWLADTWRDRFEQMLQRAEARLHPQDFGPIQTLFTDAGLLDNPQQAIAAL 

LARYPDAEWQLHPADVPFFVTLCKTLGKPVNFWVIDQDV 

GTASVAGITRMDEPVGELLDRFEQAAIDEVLGAGVEPKDVASRRLGRADVAGPLAWLDAPDVRWA 

GRTVTNPVHRIADPAEWQVHDGPENPRATHSSTGARLQTHGDDVALSWVSGTVWDIRFTLPANTV 

(XBGTPVIATEDATSAMRT^IAAGVDSPEFLPAVANGTATLWDWHPERVADHTGVTATFGEPLAP 

SLTNVPDALVGPCWPAVFAAIGSA\nT3TGEP\A^GLLSLN^LDHAARWGQLPTVPAQL7VTATAAN 

ATDTDMGRWPVSVWTGADGAVIATLEERFAILGRTGSAELADPARAGGAVSANATDTPRRRRRDV 

TITAPVDMRPFAWSGDHNPIHTDRAAALIAGLESPIVHGMVVLSAAAQHAVTATDGQARPPARLVG 

WTARFLGMVRPGDEVDFRVERVGIDQGAEIVDVAARVGSDLVMSASARLAAPKTVYAFPGQGIQHK 

GMGMEVRARSKAARKVWDTADKFTRDTLGFSVLHWRDNPTSIIASGVHYHHPDGVLYL^ 

MATVAAAQVAEMREQGAFVEGAIACGHSVGEYTALACVTGIYQLEALLEMVFHRGSKMHDIVPRDEL 

GRSNYRLAAIRPSQIDLDDADVPAFVAGIAESTGEFLEIVNFNLRGSQYAIAGTVRGLEALEAEVERRR 

ELTGGRRSFILVPGIDVPFHSRVLRVGVAEFRRSLDRVMPRDADPDLIIGRYIPNLVPRLFTLDRDFIQ 

EIRDLVPAEPLDEILADYDTWLRERPREMARTVFIELLAWQFASPVRWIETQDLLFIEEAAGGLGVERF 

VE1GVKSSPTVAGLATNTLKLPEYAHSTVEVLNAERDAAVLFATDTDPEPEPEEDEPVAESPAPDWS 

EAAPVAPAASSAGPRPDDLVFDAADATLALIALSAKMRIDQIEELDSIESITDGASSRRNQLLVDLGSE 

LNLGAIDGAAESDLAGLRSQVTKLARTYKPYGPVLSDAINDQLRTVLGPSGKRPGAIAERVKKTWELG 

EGWAKHXm/EVALGTREGSSVRGGAMGHLHEGALADAASVDKVIDAAVASVAARQGVSVALPSAG 

SGGGATIDAAALSEFTDQITGREGVLASAARLVLGQLGLDDPVNALPAAPDSELIDLVTAELGADWPR 

LVAPVFDPKKAWFDDRWASAREDLVKLWLTDEGDIDADWPRLAERFEGAGHWATQATWWQGKS 

LAAGRQIHASLYGRIAAGAENPEPGRYGGEVAWTGASKGSIAASWARLLDGGATVIATTSKLDEER 

LAFYRTLYRDHARYGAALWLVAANMASYSDVDALVEWIGTEQTESLGPQSIHIKDAQTPTLLFPFAAP 

RWGDLSEAGSRAEMEMKVLLWAVQRLIGGLSTIGAERDIASRLHWLPGSPNRGMFGGDGAYGEA 

KSALDAWSRWHAESSWAARVSU\HALIGV\rTRGTGLMGHNDAIVAAVEEAGVTTYSTDEMAALLLD 

LCDAESKVAAARSPIKADLTGGLAEANLDMAELAAKAREQMSAAAAVDEDAEAPGAIAALPSPPRGF 

TPAPPPQWDDLDVDPADLWIVGGAEIGPYGSSRTRFEMEVENELSAAGVLELA\AnTGLIRWEDDP 

QPGWYDTESGEMVDESELVQRYHDAWQRVGIREFVDDGAIDPDHASPLLVSVFLEKDFAFWSSE 

ADARAFVEFDPEHTVIRPVPDSTDWQVIRKAGTEIRVPRKTKLSRWGGQIPTGFDPTVWGISADMA 

GSIDRLAVWNMVATVDAFLSSGFSPAEVMRYVHPSLVANTQGTGMGGGTSMQTMYHGNLLGRNKP 

ND1FQEVLPNIIAAHWQSYVGSYGAMIHPVAACATAAVSVEEGVDKIRLGKAQLWAGGLDDLTLEGII 

GFGDMAATADTSMMCGRGIHDSKFSRPNDRRRLGFVEAQGGGTILLARGDLALRMGLPVLAWAFA 
QSFGDGVHTStPAPGLGALGAGRGGKDSPLARALAKLGVAADDVAVISKHDTSTLANDPNETELHER
U^ALG^SEGAPLFWSQKSLTGHAKGGAAVFQMMGLCQItRDGVlPPNRSLDCVDDELAGSAHFV 
^RLGGKFPLKAGMLTSLGFGHVSGLVALVHPQAFiASLDPAQRADVQRRADARLLAGQRRL 
ASAIAGGAPMYQRPGDRRFDHHAPERPQEASMLLNPAARLGDGEAYIG 

>Rv2555c alaS alanyl-tRNA synthase TB.seq 2873772:2876483 MW:97326 SEQ ID NO:247
VQ^HEIRKRFLDHFVKAGHTEVPSASVlLDDPNLLFVNAGMVQFVPFFLGQRTPPYPTATSiQKClRTP 

D1DEVGITTRHNTFFQMAGNFSFGDYFKRGAIELAWALLTNSLAAGGYGLDPER1WTTVYFDDDEAV 

^7glpaer 1Q rrgmad^^ 

WMQNERGEGTTKEDYQILGPLPRKNIDTGMGVERIALVLQDVHNVYETOLLRPV1DWARVAARAYD 
VG^E0DVRYRI1ADHSRTAA1LIGDGVSF<3NDGRGYV1JWU^RVIRSAKLLG1DAA1VGDLMATVRN 
AMGPSYPE^ADFERISRIAVAEETAFNRTLASGSRLFEEVASSTKKSGATVLSGSDAFTI^DTYG^I 

™aaetglqv D bgfp^QRRRakadaaarkhahadlsayreuvdagateftgfdb.rs 

QATOLGIFVOGKRVPWAHGVAGGAGEGQRVELVLDRTPLYAESGGQIADEG7ISGTGSSEAARAAV 

TDVQKIAKTI.WVHRVNVESGEFVEG07VIAAVDPGWRRGA7QGHSGTHMVHAALRQVLGPNAVQA 

G^NRP^^LRF^3WWQGPL7DDQKTQVEE\n'NEAVQADFEVRTFTEQLDKAKAMGAIALFGESYPD 

^FWVEMGGPFSLELCGGTHVSNTAQIGPVTILGESSIGSGVRRVEAYVGLOSFRHLAKERALMAGL 

A^LKVPSEEVPARVANLVERLRAAEKELERVRMASARAAATNAAAGAQRIGNVRLVAQRMSGGMT 

AAOLJ^LK^^F^iLGSEPAWALIAEGESQTVPYAVAANPAAQDLGIRANDLVKQLAVAVEGRGGGK 

ADLAQGSGKNPTGtDAALDAVRSEIAVlARVG 

>Rv2580c hisS histidyl-tRNA synthase TB.seq 2904822:2906090 MW:45118 SEQ ID NO:248
VTCF^^SAR<GVPDYVPPDSAQFVAVROGLLAAARQAGYSHIELPIFEDTALFARGVGESTDVVSKE 

^r^DA^^ADAGFRSLGLDGFRLE^GDESCRPOYRELUQEFLFGLDLDEDT^G, 
^^DDKRPELRAMTASAP^CmLSDVAKWFDT^LDALGVPYV.NPRMVRGLDYYTKTAF 

ef^lg1qsg,ggggry^^^ 

'[^^GRLRMGVRVDU.YGDRGtKGAMRAAARSGARVALVAGDRDiEAGTVAVKDL 
TTGEQVSVSMDSWAEV1SRLAG 

>Rv2614c thrS threonyl-tRNA synthase TB.seq 2941190:2943265 MW:77123 SEQ ID NO:249

V^TOHPVAANTDDGRSVIRHSTAHVLAQAVQELFPQAKLGIGPPITDGFYYDFDVPEPFTPEDLAALE 

KRMRQIWEG^FDRRVYESTEOARAELANEPYKLEL^DKSGDAE.MEVGGDELTAYDNLNPRTR 

EHVWGDLCRGPHIPTTKHIPAFKLTRSSAAYWRGDQKNASLQRIYGTAWESQEAliDRHLEFIEEAQR 

^LvELDLFSFPDBGSGLAVFHPXGGWRRELEDYSRRKHTEAGYQFVNSPH.TKA^SG 

HLDWYADGMFPPMHIDAEYNADGSLRKPGQDYYLKPMNCPMHCUFRARGRSYRELPLRLFEFGTV 
YRYEKSGWHGLTRVRGLTMDDAHIFCTRDQMRDELRSLLRFVLDLLADYGLTDFYLELSTKDPEKF
VGAEEVWEEATTVLAEVGAESGLELVPDPGGAAFYGPKISVQVKDALGRTWQMSTIQLDFNFPERF 
GLEYTAADGTRHRPVMIHRALFGSIERFFGILTEHYAGAFPAWLAPVQWGIPVADEHVAYLEEVATQ 
LKSHGVRAEVDASDDRMAKKIVHHTNHKVPFMVLAGDRDVAAGAVSFRFGDRTQINGVARDDAVAA 
IVAWIADRENAVPTAELVKVAGRE

>Rv2697c dut deoxyuridine triphosphatase TB.seq 3013683:3014144 MW:15772 SEQ ID NO:250 
VSTTLAIVRLDPGLPLPSRAHDGDAGVDLYSAEDVELAPGRRALVRTGVAVAVPFGMVGLVHPRSGL 
ATRVGLSIVNSPGT1DAGYRGEIKVALINLDPAAPIWHRGDRIAQLLVQRVELVELVEVSSFDEAGLAS 
TSRGDGGHGSSGGHASL

>Rv2782c pepR protease/peptidase, M16 family (insulinase) TB.seq 3089045:3090358 MW:47074
SEQ ID NO:251

MPRRSPADPAAAU^RRTTLPGGLRWTEFLPAVHSASVGVVVVGVGSRDEGAWAGAAHFLEHLLF 
KSTPTRSAVDIAQAMDAVGGELNAFTAKEHTCYYAHVLGSDLPLAVDLVADWLNGRCAADDVEVER
DWLEEIAMRDDDPEDALADMFLAALFGDHPVGRPVIGSAQSVSVMTRAQLQSFHLRRYTPERMW 
AAAGNVDHDGLVALVREHFGSRLVRGRRPVAPRKGTGRVNGSPRLTLVSRDAEQTHVSLGIRTPGR 
GWEHRWALSVLHTALGGGLSSRLFQEVRETRGLAYSVYSALDLFADSGALSVYAACLPERFADVMR 
VTADVLESVARDGITEAECGIAKGSLRGGLVLGLEDSSSRMSRLGRSELNYGKHRSIEHTLRQIEQVT 
VEEVNAVARHLLSRRYGAAVLGPHGSKRSLPQQLRAMVG

>Rv2783c gpsI pppGpp synthase and polyribonucleotide phosphorylase TB.seq
3090339:3092594 MW:79736 SEQ ID NO:252 

MSAAEIDEGVFETTATIDNGSFGTRTIRFETGRLALQAAGAWAYLDDDNMLLSATTASKNPKEHFDF 
FPLTVDVEERMYAAGRIPGSFFRREGRPSTDAILTCRLIDRPLRPSFVDGLRNEIQIWTILSLDPGDLY 

DVLAINAASASTQLGGLPFSGPIGGVRVALIDGTWVGFP7VDQIERAVFDMWAGRIVEGDVAIMMVE
AEATENWELVEGGAQAPTESWAAGLEAAKPFIAALCTAQQELADAAGKSGKPTVDFPVFPDYGED 
VYYSVSSVATDELAAALTIGGKAERDQRIDEIKTQWQRLADTYEGREKEVGAALRALTKKLVRQRILT 
DHFRIDGRGITDIRALSAEVAWPRAHGSALFERGETQILGVTTLDMIKMAQQIDSLGPETSKRYMHH 
YNFPPFSTGETGRVGSPKRREIGHGALAERALVPVLPSVEEFPYAIRQVSEALGSNGSTSMGSVCAS 

TLALLNAGVPLKAPVAGIAMGLVSDDIQVEGAVDGWERRFVTLTDILGAEDAFGDMDFKVAGTKDFV
TALQLDTKLDGIPSQVLAGALEQAKDARLTILEVMAEAIDRPDEMSPYAPRVTTIKVPVDKIGEVIGPK 
GKVINAITEETGAQISIEDDGTVFVGATDGPSAQAAIDKINAIANPQLPWGERFLGTVVKTTDFGAFVS 
LLPGRDGLVHISKLGKGKRIAKVEDWNVGDKLRVEIADIDKRGKISLILVADEDSTAAATDAATVTS 
>Rv2793c truB tRNA pseudouridine 55 synthase TB.seq 3102364:3103257 MW:31821
SEQ ID NO:253

MSATGPGIWIDKPAGMTSHDWGRCRRIFATRRVGHAGTLDPMATGVLVIGIERATKILGLLTAAPKS 

YAATIRLGQTTSTEDAEGQVLQSVPAKHLTIEAIDAAMERLRGEIRQVPSSVSAIKVGGRRAYRLARQ 


GRSVQLEARPIRIDRFELLAARRRDQLIDIDVEIDCSSGTYIRALARDLGDALGVGGHVTALRRTRVGR 
FELDQARSLDDLAERPALSLSLDEACLLMFARRDLTAAEASAAANGRSLPAVGIDGVYAACDADGRV. 

ALLRDEGSRTRSVAVLRPATMHPG 

>Rv2797c - TB.seq 3105619:3107304 MW:58761 SEQ ID NO:254 

VPLWADIDRWNAQAVREVFHAASARAEVTFEASRQLAALSIFANSGGKTAEAAAHHNAGIRRDLDA 

HGNEALAVARAADRAADGIVKVQSELAALRHAAAAAELT1DALINRWPIPGLRSTEAQWARTLAKQT 

ELQAELDAIMAEANAVDEELASAVNMADGDAPIPADSGPPVGPEGLTPTQLASDANEERLREERARL 

QAHLERLQAEYDQLSVRAARDYHNGILDGDAVGRU^TDELSAARGRLGELDAVDEALSRAPETYL 

TQLQIPEDPNQQVU^AVAVGNPDTAANVSVTVPGVGSTTRGALPGMVTEARDLRSEVIRQLNAAGK 

PASVATIAWMGYHPPPNPLDTGSAGDLWQTMTDGQAHAGAADLSRYLQQVRANNPSGHLTVLGHS 

YGSLTASLALQDLDAQSAHPVNDWFYGSPGLELYSPAQLGLDHGHAYVMQAPHDLITNLVAPLAPL 

HGWGLDPYLTPGFTELSSQAGFDPGGIWRDGVYAHGDYPRSFLDAAGQPQLRMSGYNLAAIAAGL 

PDNTVGPPLLPPILGGGMPAAPGPALRGGR 



>Rv2864c ponA2 TB.seq 3175454:3177262 MW:63015 SEQ ID NO:255 

MVTKTTLASATSGLLLLAWAMSGCTPRPQGPGPAAEKFFAAI-AIGDTASAAQLSDNPNEAREALNA 
AWAGLQAAHLDAQVLSAKYAEDTGWAYRFSWHLPKDRIVynTDGQLKMARDEGRVWVRWTTSGL 
HPKLGEHQTFALRADPPRRASVNEVGGTDVLVPGYLYHYSLDAGQAGRELFGTAHAWGALHPFDD 
TLNDPQLLAE(^SSSTQPLDL\nT_HADDSNRVAAAIGQLPGWITPQAEU,PTDKHFAPAVLNDVKKA 
WDELDGKAGWRWSVNQNGVDVSVLHEVAPSPASSVSITLDRWQNAAQHAVhJTRGGKAMIWIK 
PSTGEILAIAQNAGADADGPVATTGLYPPGSTFKMITAGAAVERDLATPETLLGCPGEIDIGHRTIPNY 
GGFDLGWPMSRAFASSCNTTFAELSSRLPPRGLTQAARRYGIGLDYQVDGIT7VTGSVPPTVDLAE 
RTEDGFGQGKVLASPFGMALVAATVAAGKTPVPQLIAGRPTAVEGDATPISQKMIDALRPMMRLWT 
ngtAKEIAGCGEVFGKTGEAEFPGGSHSWFAGYRGDLAFASLIVGGGSSEYAVRMTKVMFESLPPG

YLA 

>Rv2868c gcpE TB.seq 3179368:3180528 MW:40451 SEQ ID NO:256 

VWGLGMPQPPAPTLAPRRATRQLMVGNVGVGSDHPVSVQSMCTTKTHDVNSTLQQIAELTAAGC 
DIVRVACPRQEDADALAEIARHSQIPWADIHFQPRYIFAAIDAGCAAVRVNPGNIKEFDGRVGEVAKA 

AGAAGIPIRIGVNAGSLDKRFMEKYGKATPEALVESALWEASLFEEHGFGDIKISVKHNDPWMVAAY 
ELLAARCDYPLHLGVTEAGPAFQGTIKSAVAFGALLSRGIGDTIRVSLSAPPVEEVKVGNQVLESLNL 
RPRSLEIVSCPSCGRAQVDVYTLANEVTAGLDGLDVPLRVAVMGCWNGPGEAREADLGVASGNGK 
GQIFVRGEVIKTVPEAQIVETLIEEAMRLAAEMGEQDPGATPSGSPIVTVS 
>Rv2869c - TB.seq 3180548:3181759 MW:42835 SEQ ID NO:257 

MMFVTGIVLFALAILISVALHECGHMVVVARRTGMKVRRYFVGFGPTLWSTRRGETEYGVKAVPLGG 

FCDIAGMTPVEELDPDERDRAMYKQATWKRVAVLFAGPGMNl_AICL\A-IYAIALVWGLPNLHPPTRAV 

IGETGCVAQEVSQGKLEQCTGPGPAALAGIRSGDVWKVGDTPVSSFDEMAAAVRKSHGSVPIWE 
RDGTAI VTYVDI ESTQRWI PN GQGGELQPATVG Al GVGAARVGPVRYGVFSAMPATFAVTGDLTVEV
GKAI^O^LPTKVGALVRAIGGGQRDPQTPISWGASIIGGDTVDHGLVWAFWFFLAQLNLILAAINLLPL 
LPFDGGHIAVAVFERIRNMVRSARGKVAAAPVNYLKLLPATYWLVLWGYMLLTVTADLVNPIRLFQ 

>Rv2870c - TB.seq 3181770:3183077 MW:45324 SEQ ID NO:258

VATGGRWIRRRGDNEWAHNDEVTNSTDGRADGRLRVWLGSTGSIGTQALQVIADNPDRFEWG
LAAGGAHLDTLLRQRAQTGVTNIAVADEHAAQRVGDIPYHGSDAATRLVEQTEADWLNALVGALGL 
RPTLAALKTGARLALANKESLVAGGSLVLRAARPGQIVPVDSEHSALAQCLRGGTPDEVAKLVLTAS 
GGPFRGWSAADLEHVTPEQAGAHPTWSMGPMNTLNSASLVNKGLEVIETHLLFGIPYDRIDVWHP 
QSIIHSMWFIDGSTIAQASPPDMKLPISLALGWPRRVSGAAAACDFHTASSVVEFEPLDTDVFPAVEL 

ARQAGVAGGCMTAVYNAANEEAAAAFLAGRIGFPAIVGIIADVLHAADQWAVEPATVDDVLDAQRWA
RERAQRAVSGMASVAIASTAKPGAAGRHASTLERS 

>Rv2922c smc member of Smc1/Cut3/Cut14 family TB.seq 3234189:3238055 MW:139610 
SEQ ID NO:259 

VGAGSRFPLVDPLPSVGARPDRLRGQPRRRTRAGGRPGSARCVPEAAAAAAGRHDTGPRRQSRR 

15 RL VAVDGADHRVQRA\flWPL\mXSLTLKGFKSFAAPTTLRFEPGITAWGPNGSGKSNWDAl^VW 
MGEQGAICaRGGKMEDVIFAGTSSRAPLGRAEVTVSIDNSDNALPIEYTEVSITRRMFRDGASEYEIN 
GSSCRLMDVQELLSDSGIGREMHVIVGQGKLEEILQSRPEDRRAFIEEAAGVLKHRKRKEKALRKLDT 
MAANUVRLTDLTTELRRQLKPLGRQAEAAQRAAAIQADLRDARLRLAADDLVSRRAEREAVFQAEAA 
MRREHDEAAARLAVASEELAAHESAVAELSTRAESIQHTWFGLSALAERVDATVRIASERAHHLDIEP 

VAVSDTDPRKPEELEAEAQQVAVAEQQLLAELDAARARLDAARAELADRERRAAEADRAHLAAVRE
EADRREGLARLAGQVETMRARVESIDESVARLSERIEDAAMRAQQTRAEFETVQGRIGELDQGEVG 
LDEHHERWAALRLADERVAELQSAERAAERQVASLRARIDALAVGLQRKDGAAWLAHNRSGAGLF 
GSIAQLVKVRSGYEAALAAALGPAADALAVDGLTAAGSAVSALKQADGGRAVLVLSDWPAPQAPQS 
ASGEMLPSGAQWALDLVESPPQLVGAMIAMLSGVAWNDLTEAMGLVEIRPELRAVTVDGDLVGAG 

WVSGGSDRKLSTLEVTSEIDKARSEI-AAAEALAAQLNAALAGALTEQSARQDAAEQALAALNESDTAI
SAMYEQLGRLGQEARAAEEEWNRLLQQRTEQEAVRTQTLDDVIQLETQLRKAQETQRVQVAQPIDR 
C^ISAAADPJ^RGVEVEARLAVRTAEERANAVRGRADSLRRAAAAEREARVRAQQARAARLHAAAVA 
AAVADCGRLLAGRLHRAVDGASQLRDASAAQRQQRLAAMAAVRDEVNTLSARVGELTDSLHRDEL 
ANAQAALRIEQLEQMVLEQFGMAPADLITEYGPHVALPPTELEMAEFEQARERGEQVIAPAPMPFDR 

XTTQERRAKRAERALAELGRVNPLALEEFAALEERYNFLSTQLEDVKAARKDLLGWADVDARILQVFN
DAFVDVEREFRGVFTALFPGGEGRLRLTEPDDMLTTGIEVEARPPGKKITRLSLLSGGEKALTAVAML 
VAIFRARPSPFYIMDEVEAALDDVNLRRLLSLFEQLREQSQIIIITHQKPTMEVADALYGVTMQNDGITA 

VISQRMRGQQVDQLVTNSS 

>Rv2925c rnc RNAse III TB.seq 3239829:3240548 MW:25400 SEQ ID NO:260
MIRSRQPLLDALGVDLPDELLSLALTHRSYAYENGGLPTNERLEFLGDAVLGLTITDALFHRHPDRSE
GDLAKLRASWNTClALADVARRLCAEGLGVhIVLLGRGEANTGGADKSSILADGMESLLGAlYLQHGM 
EKAREVILRLFGPLLDAAPTLGAGLDWKTSLQELTA
EYGSGVGRSKKEAEQKAAAAAWKALEVLDNAMPGKTSA 

>Rv2934 ppsD TB.seq 3262245:3267725 MW:193317 SEQ ID NO:261

MTSLAERAAQLSPNARAALARELVRAGTTFPTDICEPVAWGIGCRFPGNVTGPESFWQLLADGVDT 
IEQVPPDRWDADAFYDPDPSASGRMTTKWGGFVSDVDAFDADFFGITPREAVAMDPQHRMLLEVA 
WEALEHAGIPPDSLSGTRTGN^MGLSSWDYTIVNIERRADIDAYLSTGTPHCAAVGRIAYLLGLRGPA 
VAVDTACSSSLVAIHLACQSLRLRETDVALAGGVQLTLSPFTAIALSKWSALSPTGRCNSFDANADGF 
VRGEGCG\AA^KRl^AVRDQDR\^VVRGSATNSDGRSNGMTAPNALAQRDVITSAlJ<LAD\m»D 
SVNWETHGTGTXA-GDPIEFESl^TYGLGKGQGESPCALGSVKTNIGHLEAAAGVAGRKAVLAVQR 
GHIPRNLHFTRWNPAIDASATRLFWTESAPWPAAAGPRRAAVSSFGLSGTNAHWVEQAPDTAVAA 
AGGMPWSALNVSGKTAARVASAAAVLADWMSGPGAAAPLADVAHTLNRHRARHAKFATVIARDRA 
EAJAGLRALAAGQPRVGWDCDQHAGGPGRVFVYSGQGSQWASMGQQLLANEPAFAKAVAELDPI 
FVDQVGFSLQQTLIDGDEWGIDRIQPVLVGMQLALTELVVRSYGV1PDAV1GHSMGEVSAAWAGALT 
PEQGLRVITTRSRLMARLSGQGAMAU-ELDADAAEALIAGYPQVTl^VHASPRQTVIAGPPEQVDTVI 
AAVATQNRLARRVEVDVASHHPIIDPILPELRSALADLTPQPPSIPIISTTYESAQPVADADYWSANLRN 
PVRFHQANn-AAGVDHNTFIEISPHPVLTHALTDTLDPDGSHTVMSTMNRELDQTLYFHAQLAAVGVA 
ASEHTTGRLVDLPPTPWHHQRFVVVTDRSAMSELAATHPLLGAHIEMPRNGDHVWQTDVGTEVCPW 
LADHKVFGQPI MPAAGFAEI ALAAASEALGTAADAVAPNIVINQFEVEQMLPLDGHTPLTTQLI RGGDS 
QIRVEIYSRTRGGEFCRHATAKVEQSPRECAHAHPEAQGPATGTTVSPADFYALLRQTGQHHGPAF

AALS Rl VRLADGS AETEISI PDEAPRHPGYRLH PWLD AALQSVG AAI PDGEI AGSAEASYLPVSFET1 R 
VYRDIGRHVRCRAHLTNLDGGTGKMGRIVLINDAGHIAAEVDGIYLRRVERRAVPLPLEQKIFDAEWT 
ESPIAAVPAPEPAAETTRGSVVLVLADATVDAPGKAQAKSMADDFVQQVVRSPMRRVHTADIHDESAV 

LAAFAETAGDPEHPPVGVWFVGGASSRLDDELAAARD™^ 

GGLSVADDEPGTPAAASLKGLVRVLAFEHPDMRTTLVDLDITQDPLTALSAELRNAGSGSRHDDVIA 
WRGERRFVERLSP^TIDVSKGHPWRQGASYWTGGLGGLGLWARWLVDRGAGRVVLGGRSDPT 
DEQCNVLAELQTRAEIWVRGDVASPGVAEKLIETARQSGGQLRGWHAAAVIEDSLVFSMSRDNLE 
RVWAPKATGALRMHEATADCELDWWLGFSSAASLLGSPGQAAYACASAVVLDALVGWRRASGLPA 

AVI NWGPWSEVGVAQALVGSVLDTISVAEGI EALDSLLAADRI RTGVARLRADRALVAFPEI RSISYFT 
QWEELDSAGDLGDWGGPDALADLDPGEARRAVTERMCARIAAVMGYTDQSTVEPAVPLDKPLTEL 
GLDSLMAVRIRNGARADFGVEPPVALILQGASLHDLTADLMRQLGLNDPDPALNNADTIRDRARQRA 

AARHGAAMRRRPKPEVQGG 
>Rv2946c pks1 TB.seq 3291503:3296350 MW:166642 SEQ ID NO:262

VISARSAEALTAC^GRLMAHVQANPGLDPIDVGCSLASRSVFEHRAVWGASREQLIAGLAGLAAGE 
PGAGVAVGQPGSVGKTVWFPGQGAQRIGMGRELYGELPVFAQAFDAVADELDRHLRLPLRDVIW 
GADADLLDSTEFAQPALFAVEVASFAVLRDWGVLPDFVMGHSVGELAAAHAAGVLTLADAAMLWA 
RGRLMQALPAGGAMVAVAASEDEVEPLLGEGVGIAAINAPESWISGAQAAANAIADRFAAQGRRVH
QLAVSHAFHSPLMEPMLEEFARVAARVQAREPQLGLVSNVTGELAGPDFGSAQYWVDHVRRPVRF 
ADSARHLQTLGATHFIEAGPGSGLTGSIEQSI-APAEAMWSMLGKDRPELASALGAAGQVFTTGVPV 
QWSAVFAGSGGRRVQLPTYAFQRRRFWETPGADGPADAAGLGLGATEHALLGAWERPDSDE\A/L 
TGRLSLADQPWLADIHWNGWLFPGAGFVELVIRAGDEVGCALIEELVLAAPLVMHPGVGVQVQ 
GAADESGHRAVSVYSRGDQSQGWLLNAEGMLGVAAAETPMDLSVWPPEGAESVDISDGYAQLAE 
RGYAYGPAFQGLVA1WRRGSELFAEWAPGEAGVAVDRMGMHPAVLDAVLHALGLAVEKTQASTET 
RLPFCWRGVSLHAGGAGRVRARFASAGADAISVDVCDATGLPVLTVRSLVTRPITAEQLRAAVTAAG 
GASDQGPLEWWSPISWSGGANGSAPPAPVSWADFCAGSDGDASWVWELESAGGQASSWGS 
WAATHTALEVLQSWLGADRAATLWLTHGGVGLAGEDISDLAAAAVWGMARSAC^ENPGRIVLIDT 
DAAVDASVLAGVGEPQLLVRGGTVHAPRLSPAPALLALPAAESAWRLAAGGGGTLEDLV1QPCPEV 
QAPLQAGQVRVAVAAVGVNFRDWAALGMYPGQAPPLGAEGAGWLETGPEVTDLAVGDAVMGFL 
GGAGPUVWDQQLVTRVPQGWSFAQAAAVPWFLTAWY'G^ 

QLARQWGVEVFVTASRGKWDTLRAMGFDDDHIGDSRTCEFEEKFLAVTEGRGVDWLDSLAGEFV 
DASLRLLVRGGRFLEMGKTDIRDAQEIAANYPGVQYRAFDLSEAGPARMQEMLAEVRELFDTRELH 
RLPVTTWDVRCAPAAFRFMSQARHIGKWLTMPSALADRI^DGTWITGATGAVGGVLARHLVGAY 
GVRHLVLASRRGDRAEGAAELAADLTEAGAKVQWACDVADRAAVAGLFAQLSREYPPVRGVIHAA 
GVLDDAVITSLTPDRIDTVLRAKVDAAWNLHQATSDLDLSMFALCSSIAATVGSPGQGNYSAANAFLD 
GU^AHRQAAGLAGISLAWGLWEQPGGMTAHLSSRDLARMSRSGLAPMSPAEAVELFDAALAIDHPL 
AVATLLDRAALDARAOAGALPALFSGLARRPRRRQIDDTGDATSSKSALAQRLHGLAADEQLELLVG 
LVCLQAAAVLG RPSAEDVDPDTEFGDLGFDSLTAVELRNRLKTATGLTLPPTVI FDHPTPTAVAEYVA 
QQMSGSRPTESGDPTSQWEPAAAEVSVHA 

>Rv3014c ligA DNA ligase TB.seq 3372545:3374617 MW:75258 SEQ ID NO:263 
VSSPDADQTAPEVLRQWQALAEEVREHQFRYYVRDAPIISDAEFDELLRRLEALEEQHPELRTPDSP 

TQLVGGAGFATDFEPVDHLERMLSLDNAFTADELAAWAGRIHAEVGDAAHYLCELKIDGVALSLVYR
EGRLTRASTRGDGRTGEDVTLNART1ADVPERLTPGDDYPVPEVLEVRGEVFFRLDDFQALNASLVE 
EGKAPFANPRNSAAGSLRQKDPAVTARRRLRMICHGLGHVEGFRPATLHQAYLALRAWGLPVSEHT 
TLATDLAGVRERIDYWGEHRHEVDHEIDGWVKVDEVALQRRLGSTSRAPRWAIAYKYPPEEAQTKL 
LDIRVNVGRTGRITPFAFMTPVKVAGSTVGQATLHNASEIKRKGVUGDTWIRKAGDVIPEVLGPWE 

LRDGSEREFIMPTTCPECGSPLAPEKEGDADIRCPNARGCPGQLRERVFHVASRNGLDIEVLGYEAG
VALLQAKVIADEGELFALTERDU.RTDLFRTKAGELSANGKRLLVNLDKAKAAPLWRVLVALSIRHVGP 
TAARALATEFGSLDAIAAASTDQU^AVEGVGPTIAAAVTEWFAVDWHREIVDKWRAAGVRMVDERD 
ESVPRTLAGLTIWTGSLTGFSRDDAKEAIVARGGKAAGSVSKKTNYWAGDSPGSKYDKAVELGVPI 

LDEDGFRRLLADGPASRT 
>Rv3025c - NifS-like protein TB.seq 3383885:3385063 MW:40948 SEQ ID NO:264

MAYLDHAATTPMHPAAIEAMAAVQRTIGNASSLHTSGRSARRRIEEARELIADKLGARPSEVIFTAGG 

TESDNLAVKGIYWARRDAEPHRRRI\nTEVEHHAVLDSVNWLVEHEGArf^«.PTAADGSVSATAL 
REALQSHDDVALVSVMWANNEVGTILPIAEMSWAMEFGVPMHSDAIQAVGQLPLDFGASGLSAMS
VAGHKFGGPPGVGALLLRRDVTCVPLMHGGGQERDIRSGTPDVASAVGMATAAQIAVDGLEENSAR 
LRLLRDRLVEGVLAEIDDVCLNGADDPMRLAGNAHFTFRGCEGDALLMLLDANGIECSTGSACTAGV 
AQPSHVLIAMGVDAASARGSLRLSLGHTSVEADVDAALEVLPGAVARARRAALAAAGASR 

>Rv3080c pknK serine-threonine protein kinase TB.seq 3442656:3445985 MW:119420
SEQ ID NO:265

I^DVDPHATRRDLVPNIPAELLEAGFDNVEEIGRGGFGWYRCVQPSLDRAVAVKVLSTDLDRDNLE 

RFLREQF^GRLSGHPHI\mA.QVGVLAGGRPFIVMPYHAKNSLETLIRRHGPLDWRETLSIGVKLA 

GALEAAHRVGTLHRDVKPGNILLTDYGEPQLTDFGIARIAGGFETATGVIAGSPAFTAPEVLEGASPTP 

ASDVYSLGATLFCALTGHAAYERRSGERVIAQFLRITSQPIPDLRKQGLPADVAAAIERAMARHPADR 

PATAADVGEELRDVQRRNGVSVDEMPLPVELGVERRRSPEAHAAHRHTGGGTPTVPTPPTPATKY 

RPSVPTGSLVTRSRLTDILRAGGRRRLILIHAPSGFGKSTLAAQWREELSRDGAAVAWLTIDNDDNNE 

NWFLSHLLESIRRVRPTLAESLGHVLEEHGDDAGRYVLTSLIDEIHENDDRIAWIDDWHRVSDSRTQ 

AALGFLLDNGCHHLQLIVn-SWSRAGLPVGRLRIGDELAEIDSAALRFDTDEAAALLNDAGGLRLPRAD 

VQALTTSTDGWAAALRUW.SLRGGGDATQLLRGLSGASDVIHEFLSENVLDTLEPELREFLLVASVT 

ERTCGGLASALAGITNGRAMLEEAEHRGLFLQRTEDDPNWFRFHQMFADFLHRRLERGGSHRVAEL 

HRRASAWFAENGYLHEAVDHAU^GDPARAVDLVEQDETNLPEQSKMTTU^VQKLPTSMVVSRA 

RLQLAIAWANILLQRPAPATGALNRFETALGRAELPEATQADLRAEADVLRAVAEVFADRVERVDDLL 

AEAMSRPDTLPPRVPGTAGr^AALAAICRFEFAEWPLLDWAAPYQEMMGPFGTVYAQCLRGMAAR 

NRLDIVAALQNFRTAFEVGTAVGAHSHAARLAGSLLAELLYETGDLAGAGRLMDESYLLGSEGGAVD 

YLAARYVIGARVKAAQGDHEGAADRLSTGGDTAVQLGLPRLAARINNERIRLGIALPAAVAADLLAPR 

TIPRDNGIATMTAELDEDSAVRLLSAGDSADRDQACQRAGALAAAIDGTRRPLAALQAQILHIETLAAT 

GRESDARNELAPVATKCAELGLSRLLVDAGLA 

>Rv3106 fprA adrenodoxin and NADPH ferredoxin reductase TB.seq 3474004:3475371 
MW:49342 SEQ ID NO:266 

MRPYYIAIVGSGPSAFFAAASLLKAADTTEDLDMAVDMLEMLPTPWGLVRSGVAPDHPK1KSISKQFE 

KTAEDPRFRFFGNWVGEHVQPGELSERYDAVIYAVGAQSDRMLNIPGEDLPGSIAAVDFVGWYNA 

HPHFEQVSPDLSGARAWIGNGNVALDVARILLTDPDVLARTDIADHALESLRPRGIQEWIVGRRGPL 

QAAFTTLELREU^LDGVDWIDPAELDGITDEDAAAVGKVCKQNIKVLRGYADREPRPGHRRMVFR 

FLTSPIEIKGKRKVERIVLGRNELVSDGSGRVAAKDTGEREELPAQLWRSVGYRGVPTPGLPFDDQ 

SGTIPNVGGRINGSPNEYWGWIKRGPTGVIGTNKKDAQDTVDTLIKNLGNAKEGAECKSFPEDHAD 

QVADWLAARQPKLVTSAHWQVIDAFERAAGEPHGRPRVKLASLAELLRIGLG 

>Rv3235 - TB.seq 3611296:3611934 MW:22659 SEQ ID NO:267

MMASNQTAAQHSSATLQQAPRSIDDAGGCPLTISPIANSPGDTFAVTPWEYEPPPRNIPPCGQSSH 
AARRPHTPQLARRQPIRPSGRAPAAVTSTAKSPRLRQAGTFADAALRRVLEVIDRRRPVGQLRPLLA 
PGLVDSVLAVSRTAAGHQQGAAMLRRIRLTPAGPDTADTAAEVFGTYSRGDRIHA1ACRVEQRPAGN
ETRWLMVALHIG 

>Rv3255c manA mannose-6-phosphate isomerase TB.seq 3635040:3636263 MW:43340
SEQ ID NO:268 

VELLRGALRTYAWGSRTAIAEFTGRPVPAAHPEAELWFGAHPGDPAWLQTPHGQTSLLEALVADPE 

GQLGSASRARFGDVLPFLVKVLAADEPLSLQAHPSAEQAVEGYLREERMGIPVSSPVRNYRDTSHK 

PELLVALQPFEAI^GFREAARTTELLRALAVSDLDPFIDLLSEGSDADGLRALFTTWITAPQPDIDVLV 

PAVLDGAIQWSSGATEFGAEAKTVLELGERYPGDAGVLAALLLNRISLAPGEA1FLPAGNLHAYVRG 

FGVEVMANSDNVLRGGLTPKHVDVPELLRVLDFAPTPKARLRPPIRREGLGLVFETPTDEFAATLLVL 

DGDHLGHEVDASSGHDGPQ1LLCTEGSATVHGKCGSLTLQRGTAAWVAADDGPIRLTAGQPAKLFR 

ATVGL 

>Rv3264c rmlA2 glucose-1-phosphate thymidyltransferase TB.seq 3644897:3645973 MW:37840
SEQ ID NO:269 

LATHQVDAWLVGGKGTRLRPLTLSAPKPMLPTAGLPFLTHLLSRIAAAGIEHVILGTSYKPAVFEAEF 

GDGSALGLQIEYVTEEHPLGTGGGIANVAGKLRNDTAMVFNGDVLSGADLAQLLDFHRSNRADVTL 

QLVRVGDPRAFGCVPTDEEDRWAFLEKTEDPPTDQINAGCYVFERNVIDRIPQGREVSVEREVFPA 

LLADGDCKIYGYN^ASYVVRDMGTPEDFVRGSADLVRGIAPSPALRGHRGEQLVHDGAAVSPGALU 

GGTWGRGAEIGPGTRLDGAVIFDGVRVEAGCVIERSIIGFGARIGPRALIRDGVIGDGADIGARCELL 

SGARVWPGVFLPDGGIRYSSDV 

>Rv3368c - TB.seq 3780334:3780975 MW:23734 SEQ ID NO:270 

MTT_NLSVDEVLTTTRSVRKRLDFDKPVPRDVLMECLEI_ALQAPTGSNSQGWQVVVFVEDAAKKKAIA 

DVYLANARGYLSGPAPEYPDGDTRGERMGRVRDSATYLAEHMHRAPVLLIPCLKGREDESAVGGVS 

FWASLFPAVWSFCLALRSRGLGSCWTTLHLLDNGEHKVADVLGIPYDEYSQGGLLPIAYTQGIDFRP 

AKRLPAESVTHWNGW 

>Rv3382c lytB1 TB.seq 3796447:3797433 MW:34667 SEQ ID NO:271 

MAEVFVGPVAQGYASGEVTVLLASPRSFCAGVERAIEWKRN^DVAEGPWRKQIVHNTVWAELR 

DRGAVFVEDLDEIPDPPPPGANAA/FSAHGVSPAVRAGADERGLQWDATCPLVAKVHAEAARFAAR 

GDTWFIGHAGHEETEGTLGVAPRSTLLVQTPADVAALNLPEGTQLSYLTQTTLALDETADVIDALRA 

RFPTLGQPPSEDICYATTNRQRALQSMVGECDWLVIGSCNSSNSRRLVELAQRSGTPAYLIDGPDDI 

EPEWLSSVSTIGVTAGASAPPRLVGQVIDALRGYASITWERSIATETVRFGLPKQVRAQ 

>Rv3418c groES 10 kD chaperone TB.seq 3836985:3837284 MW:10773 SEQ ID NO:272
VAKVNIKPLEDKILVQANEAETTTASGLVIPDTAKEKPQEGTWAVGPGRWDEDGEKRIPLDVAEGDT 

VIYSKYGGTEIKYNGEEYLILSARDVLAWSK 
>Rv3423c alr TB.seq 3840193:3841416 MW:43357 SEQ ID NO:273

VKRFWENVGKPNDTTDGRGTTSLAMTPISQTPGLLAEAMVDLGAIEHNVRVLREHAGHAQLMAVVK 

ADGYGHGATRVAQTALGAGAAELGVATVDEALALRADGITAPVLAVVLHPPGIDFGPALLADVQVAVS 

SLRQLDELLHAVRRTGRTATVTVKVDTGLNRNGVGPAQFPAMLTALRQAMAEDAVRLRGLMSHMV 

YADKPDDSINDVQAQRFTAFLAQAREQGVRFEVAHLSNSSATMARPDLTFDLVRPGIAVYGLSPVPA 

LGDMGLVPAMWKCAVALVKSIRAGEGVSYGHTWIAPRDTNLALLPIGYADGVFRSLGGRLEVLINGR 

RCPGVGRICMDQFMVDLGPGPLDVAEGDEAILFGPGIRGEPTAQDWADLVGTIHYEWTSPRGRITR 

TYREAENR 

>Rv3490 otsA [alpha],[alpha]-trehalose-phosphate synthase TB.seq 3908232:3909731 MW:55864
SEQ ID NO:274 

MAPSGGQEAQICDSETFGDSDFWVANRLPVDLERLPDGSTTWKRSPGGLVTALEPVLRRRRGAW 

VGWPGVNDDGAEPDLHVLDGPIIQDELELHPVRLSTTDIAQYYEGFSNATLWPLYHDVIVKPLYHRE 

WWDRWDVNQRFAEAASRAAAHGATVWVQDYQLQLVPKMLRMLRPDLTIGFFLHIPFPPVELFMQ 

MPWRTEIIQGLLGADLVGFHLPGGAQNFLILSRRLVGTDTSRGTVGVRSRFGAAVLGSRTIRVGAFPI 

SVDSGALDHAARDRNIRRRAREIRTELGNPRKILLGVDRLDYTKGIDVRLKAFSELLAEGRVKRDDTV 

WQU\TPSRERVESYQTLRNDIERQVGHINGEYGEVGHPWHYLHRPAPRDELIAFFVASDVMLVTP 

LRDGMNLVAKEYVACRSDLGGALVLSEFTGAAAELRHAYLVNPHDLEGVKDGIEEALNQTEEAGRR 

RMRSLRRQVLAHDVDRWAQSFLDALAGAHPRGQG 

>Rv3598c lysS lysyl-tRNA synthase TB.seq 4041423:4042937 MW:55678 SEQ ID NO:275 

VSAADTAEDLPEQFRI RRDKRARLLAQGRDPYPVAVPRTHTLAEVRAAHPDLPI DTATEDI VGVAGRV 

IFARNSGKLCFATLQDGDGTQLQVMISLDKVGQAALDAWKADVDLGDIVYVHGAVISSRRGELSVLA 

DCWRIAAKSLRPLPVAHKEMSEESRVRQRYVDLIVRPEARAVARLRIAWRAIRTALQRRGFLEVETP 

VLQTLAGGAAARPFATHSNALDIDLYLRIAPELFLKRCIVGGFDKVFELNRVFRNEGADSTHSPEFSM 

LETYQTYGTYDDSAVVTRELIQEVADEAIGTRQLPLPDGSVYDIDGEWATIQMYPSLSVALGEEITPQT 

TVDRLRGIADSLGLEKDPAIHDNRGFGHGKLIEELWERTVGKSLSAPTFVKDFPVQTTPLTRQHRSIP 

GVTEKWDLYLRGIELATGYSELSDPWQRERFADQARAAAAGDDEAMVLDEDFLAALEYGMPPCTG 

TGMGIDRLLMSLTGLSIRETVLFPIVRPHSN 

>Rv3600c - similar to Bacillus subtilis protein YacB TB.seq 4043041:4043856 MW:29274
SEQ ID NO:276 

VLLAIDVRNTHTWGLLSGMKEHAKWQQWRIRTESEVTADELALTIDGLIGEDSERLTGTAALSTVPS 
VLHEVRIMLDQYWPSVPHVLIEPGVRTGIPLLVDNPKEVGADRIVNCLAAYDRFRKAAIWDFGSSICV 
DWSAKGEFLGGAIAPGVQVSSDAAAARSAALRRVELARPRSWGKNTVECMQAGAVFGFAGLVDG 
LVGRIREDVSGFSVDHDVAIVATGHTAPLLLPELHTVDHYDQHLTLQGLRLVFERNLEVQRGRLKTAR 
>Rv3606c folK 7,8-dihydro-6-hydroxymethylpterin pyrophosphokinase TB.seq
4048181:4048744 MW:20732 SEQ ID NO:277 

IVTTRWLSVGSNLGDRLARLRSVADGLGDALIAASPIYEADPWGGVEQGQFLNAVLIADDPTCEPREW 

LRRAQEFERAAGRVRGQRWGPRNLDVDLIACYQTSATEALVEVTARENHLTLPHPLAHLRAFVLIPW 

IAVDPTAQLTVAGCPRPVTRLLAELEPADRDSVRLFRPSFDLNSRHPVSRAPES 

>Rv3607c * folX may be involved in folate biosynthesis TB.seq 4048744:4049142 MW:14553

MADRIELRGLTVHGRHGVYDHERVAGQRFVIDVTVWIDLAEAA^ 

PPPJ<LIEWGAEIADHVMDDQRVHAVEVAVHKPQAPIPQTFDDVAW1RRSRRGGRGWWPAGGAV 

>Rv3608c folP dihydropteroate synthase TB.seq 4049138:4049977 MW:28812 SEQ ID NO:278 

VSPAPVQVMGVLNVTDDSFSDGGCYLDLDDAVKHGLAMAAAGAGIVDVGGESSRPGATRVDPAVE 

TSRVIPWKELAAQGIWSIDTMRADVARAALQNGAQ^^NDVSGGRADPAMGPlJ-AEADVPVVv^_MH 

WRAVSADTPHVPVRYGNWAEVRADLLASVADAVAAGVDPARLVLDPGLGFAKTAQHNWAILHALP 

ELVATGIPV^VGASRKRFLGALLAGPDGVMRPTDGRDTATAV^SALAALHGAWGVRVHDVTIASVDAI 

KWEAWMGAERIERDG 

>Rv3609c folE GTP cyclohydrolase I TB.seq 4049977:4050582 MW:22395 SEQ ID NO:279 
MSQLDSRSASARIRVFDQQRAEAAVRELLYAIGEDPDRDGLVATPSRVARSYREMFAGLYTDPDSVL 

NTMFDEDHDELVLVKEIPMYSTCEHHLVAFHGVAHVGYIPGDDGRVTGLSKIARLVDLYAKRPQVQE 

RLTSQIADALMKKLDPRGVIWIEAEHLCMAMRGVRKPGSVTTTSAVRGLFKTNAASRAEALDLILRK 

>Rv3610c ftsH inner membrane protein, chaperone TB.seq 4050601:4052880 MW:81987 
MNRKNVTRTITAIAWVLLGWSFFYFSDDTRGYKPVDTSVAITQINGDNVXSAQIDDREQQLRLILKKG 

NNETDGSEKvlTKYPTGYAVDLFNALSAKNAKVSTWNQGSILGELLVYVLPLLLLVGLFVMFSRMQG 

GARMGFGFGKSRAKQLSKDMPKTTFADVAGVDEAVEELYEIKDFLQNPSRYQALGAKIPKGVLLYGP 

PGTGKTLLARAVAGEAGVPFFTISGSDFVEMFVGVGASRVRDLFEQAKQNSPCIIFVDEIDAVGRQR 

GAGLGGGHDEREQTLNQLLVEMDGFGDRAGVILIAATNRPDILDPALLRPGRFDRQIPVSNPDLAGR 

RAVLRVHSKGKPMAADADLDGLAKRWGMTGADLANVINEAALLTARENGTVITGPALEEAVDRVIG 

GPRRKGRIISEQEKKITAYHEGGHTLAAWAMPDIEPIYKvTILARGRTGGHAVAVPEEDKGLRTRSEMI 

AQLVFAMGGRAAEELVFREPTTGAVSDIEQATKIARSMVTEFGMSSKLGAVKYGSEHGDPFLGRTM 

GTQPDYSHEVAREIDEEVRKLIEAAHTEAWEILTEYRDVLDTLAGELLEKETLHRPELESIFADVEKRP 

RLTMFDDFGGRIPSDKPPIKTPGELAIERGEPWPQPVPEPAFKAAIAQATQAAEAARSDAGQTGHGA 

NGSPAGTHRSGDRQYGSTQPDYGAPAGWHAPGWPPRSSHRPSYSGEPAPTYPGQPYPTGQADP 

GSDESSAEQDDEVSRTKPAHG 

>Rv3671c - TB.seq 4112322:4113512 MW:40722 SEQ ID NO:280

MTPSQWLDIAV1AVAFIAAISGWRAGALGSMLSFGGVLLGATAGVLLAPHIVSQISAPRAKLFAALFLIL 
ALWVGEVAGWLGP^VRGAIRNRPIRLIDSVIGVGVQLVWLTAAVVLLAMPLTQSKEQPELAAAVKG 

SRVLARVN EAAPTWLKTVPKRLSALLNTSGLPAVLEPFSRTPVI PVASPDPALVN NPWAATEPSWKI 
RSLAPRCQKv^EGTGFVISPDRVMTNAHWAGSNNVTVYAGDKPFEATWSYDPSVDVAILAVPHLP 
PPPLVFAAEPAKTGADVWLGYPGGGNFTATPARIREAIRLSGPDIYGDPEP\n"RDVYTIRADVEQGD

SGGPLIDLNGQVLGWFGAA1DDAETGFVLTAGEVAGQLAKIGATQPVGTGACVS 

>Rv3682 ponA2 TB.seq 4121913:4124342 MW:84637 SEQ ID NO:281 

MPERLPAAIT\A.KLAGCCLI^SWATALTFPFAGGLGLMSNPJVSEWANGSAQLLEGQVPAVSTMVD 

AKGNTIAVVLYSQRRFEVPSDKIANTMKLAIVSIEDKRFADHSGVDWKGTLTGLAGYASGDLOTRGGS 

TLEQQWKNYQLLVTAQTDAEKRAAVETTPARKLREIRMALTLDKTFTKSEILTRYLNLVSFGNNSFG 

VQDAAQTYFGINASDLNWQQAAL1AGMVQSTSTLNPYTNPDGALARRNVVLDTMIENLPGEAEALR 

AAKAEPLGVLPQPNELPRGCIAAGDRAFFCDWQEYLSF^GISKEQVATGGYLIRTTLDPEVQAPVKA 

AIDKYASPNLAGISSVMSVIKPGKDAHKVLAMASNRKYGLDLEAGETMRPQPFSLVGDGAGSIFKIFT 

TAAALDMGMGINAQLDVPPRFQAKGLGSGGAKGCPKETWCWNAGNYRGSMNVTDALATSPr4TAF 

AKLISQVGVGRAVDMAJKLGLRSYANPGTARDYNPDSNESUVDFVKRQNLGSFTLGPIELNALELSNV 

AATLASGGVWCPPNPIDQLIDRNGNEVAVTTETCDQWPAGLANT1_ANAMSKDAVGSGTAAGSAGA 

AGWDLPMSGKTGTTEAHRSAGFVGFTNRYAAANYIYDDSSSPTDLCSGPLRHCGSGDLYGGNEPS 

RTWFAAMKPIANNFGEVQLPPTDPRYVDGAPGSRVPSVAGLDVDAARQRLKDAGFQVADQTNSVN 

SSAKYGEWGTSPSGQTIPGS1VTIQISNGIPPAPPPPPLPEDGGPPPPVGSQWEIPGLPPITIPLLAP 

PPPAPPP 

>Rv3721c dnaZX DNA polymerase III, [gamma] (dnaZ) and [tau] (dnaX) TB.seq 4164995:4166728
MW:61892 SEQ ID NO:282 

VALYRKYRPASFAEWGQEHVTAPLSVALDAGRINHAYLFSGPRGCGKTSSARILARSLNCAQGPTA 
NPCGVCESCVSLAPNAPGSIDWELDAASHGGVDDTRELRDRAFYAPVQSRYRVFIVDEAHMVTTA 
GFNALLKIVEEPPEHLIFIFATTEPEKVLPT1RSRTHHYPFRLLPPRTMRALLARICEQEGVVVDDAVYP 
LVIRAGGGSPRDTLSX^DQLLAGAADTHVTYTRALGLLGVTDVALIDD 

DGGHDPRRFATDLLERFRDLIVLQSVPDAASRGWDAPEDALDRMREQAARIGRATLTRYAEWQA 

GLGEMRGATAPRLLLEWCARLLLPSASDAESALLQRVERIETRLDMSIPAPQAVPRPSAAAAEPKHQ 

PAREPRPN^PTPASSEPWAAVRSMWPWRDKVRLRSRTTEVMLAGAWRALEDNTLVLTHESAPL 

ARRLSEQRNADVLAEALKDALGVNWRVRCETGEPAAAASPVGGGANVATAKAVNPAPTANSTQRD 

EEEHMLAEAGRGDPSPRRDPEEVALELLQNELGARRIDNA 

>Rv3783 - TB.seq 4229255:4230094 MW:32337 SEQ ID NO:283

MTFMDAQASFQTQSRTIJ^VRGDLVDGFRRHELWLHLGWQDIKQRYRRSVLGPFWITIATGTTAVA 
MGGLYSKLFRLELSEHLPYVTLGLIVWNLINAAILDGAEVFVANEGLIKQLPAPLSVHVYRLVWRQMIF 
FAHNMYFVIAIIFPKPWSWADLSFLPALALIFLNCVWVSLCFGILATRYRDIGPLLFSWQLLFFMTPII 
WNDETLRRQGAGRWSSIVELNPLLHYLDIVRAPLLGAHQELRHVV^VVLVLTVVGWMLAAFAM 

ARVPYWV 

>Rv3789 - TB.seq 4235371:4235733 MW:13378 SEQ ID NO:284 

MRFVVTGGLAGIVDFGLYWLYKVAGLQVDLSKAISFIVGTITAYLINRRWTFQAEPSTARFVAVMLLY 
GITF AVQVGLN HLCLALLHYRAWAI PVAFVI AQGTATVl N Fl VQRAVI FRI R 
>Rv3790 - TB.seq 4235776:4237158 MW:50164 SEQ ID NO:285 
MLSVGATTTATRLTGWGRTAPSVANVLRTPDAEMIVKAVARVAESGGGRGA1ARGLGRSYGDNAQN
GGGLVIDMTPLNTIHSIDADTKLVDIDAGVNLDQLMKAALPFGLVVVPVLPGTRQVTVGGAIACDIHGK 
NHHSAGSFGNHVRSMDLLTADGEIRHLTPTGEDAELFWATVGGNGLTGIIMRATIEMTPTSTAYFIAD 
GD\n"ASLDETIALHSDGSEARYTYSSAWFDAISAPPKLGRAAVSRGRLATVEQLPAKLRSEPLKFDAP 
5 Q LLT LPDWPNGU^KYTFGPIGEL\^KSGTYRGKVQNLTQFYHPLDMFGEWNRAYGPAGFLQYQ 
FVIPTEAVDEFKKIIGVIQASGHYSFLNVFKLFGPRNQAPLSFPIPGWNICVDFPIKDGLGKFVSELDRR 
VLEFGGRLYTAKDSRTTAETFHAMYPRVDEWISVRRKVDPLRVFASDMARRLELL 
>Rv3791 - TB.seq 4237162:4237923 MW:27470 SEQ ID NO:286 

IVM.DAVGNPQTVLLLGGTSEIGLAICERYLHNSAARIVLACLPDDPRREDAAAAMKQAGARSVELIDF 
DALDTDSHPKMIEAAFSGGDVDVAIVAFGLLGDAEELWQNQRKAVQIAEINYTAAVSVGVLLAEKMR
AQGFGQIIAMSSAAGERVRRANFWGSTKAGLDGFYLGLSEALREYGVRVLVIRPGQVRTRMSAHLK 
EAPLTVDKEYVANLAVTASAKGKELVWAPAAFRYVMMVLRHIPRSIFRKLPI 
>Rv3794 embA TB.seq 4243230:4246511 MW:115694 SEQ ID NO:287

VPHDGNERSHRIARLAAWSGIAGLLLCGIVPLLPVNQTTATIFWPQGSTADGNITQITAPLVSGAPRA 
LDISIPCSAIATLPANGGLVLSTLPAGGVDTGKAGLFVP^NQDTVWAFRDSVAAVAARSTIAAGGCS
ALHIWADTGGAGADFMGIPGGAGTLPPEKKPQVGGIFTDU<VGAQPGLSARVDIDTRFITTPGALKKA 

VMLi.GVLAVLVAMVGLAALDRLSRGRTLRDWLTRYRPR 
DDGYLLWARVAPKAGWANYYRYFGTTHAPFDWYTSVLAQLAAVSTAGVW^ 

SRFVLRRLGPGPGGLASNRVAVFTAGAVFLSAVVLPFNNGLRPEPLIALGVLVTV\M_VERSIALGR1_AP 
20 AAVAIIVATLTATU^QGLIAIJVPLLTGARAIAQRIRRRRATDGLLAPI^VLAAALSLITVWFRDQTI^W 
AESARIKYKVGPTIAVVYQDFLRYYFLTVESNVEGSMSRRFAVLVLLFCLFGVLFVLLRRGRVAGLASG 
PAWRLIGTTAVGLLLLTFTPTKWAVQFGAFAGl^G\n.GA\n-AFTFARIGLHSRRNLTLYVTALLFVLA 
WATSGINGWFWGNYGWWYDIQPVIASHPVTSMFLTLSILTGLLAAVVYHFRMDYAGHTEVKDNRR 
NRILASTPLLWAVIMVAGEVGSMAKAAVFRYPLYTTAKANLTALSTGLSSCAMADDVLAEPDPNAGM 
LQPVPGQAFGPDGPLGGISPVGFKPEGVGEDLKSDPWSKPGLVNSDASPNKPNAAITDSAGTAGG
KGPVGINGSHAALPFGLDPARTPVMGSYGENNLAATATSAWYQLPPRSPDRPLWVSAAGAIWSYK 
EDGDFIYGQSLKLQWGVTGPDGRIQPLGQVFPIDIGPQPAWRNLRFPLAWAPPEADVARIVAYDPNL 
SPEQWFAFTPPRVPVLESLQRLIGSATPVLMDIATAANFPCQRPFSEHLGIAELPQYRILPDHKQTAA 
SSNLWQSSSTGGPFLFTC^LRTSTIATYLRGDVVYRDWGSVEQYHRLVPADQAPDAWEEGVITVP 

GWGRPGPIRALP

>Rv3795 embB TB.seq 4246511:4249804 MW:118023 SEQ ID NO:288

MTQCASRRKSTPNPJ^LGAFASARGTRWVATIAGLIGFVLSVATPLLPWQTTAMLDWPQRGQLGSV 
TAPLISLTPVDFTATVPCDWRAMPPAGGWLGTAPKQGKDANLQALFVWSAQRVDVTDRNWILS 
VPREQVTSPQCQRIEVTSTHAGTFANFVGLKDPSGAPLRSGFPDPNLRPQIVGVFTDLTGPAPPGLA 
vsATIDTRFSTRPTTLKLLAIIGAIVATVVALIALWRLDQLDGRGSIAQLLLRPFRPASSPGGMRRLIPAS
WRTFTLTDAWIFGFLLWHVIGANSSDDGYILGMARVADHAGYMSNYFRWFGSPEDPFGVVYYNLLA 
LMTHVSDASLWMRLPDLAAGLVCWLLLSREVLPRLGPAVEASKPAYWAAAMVLLTAWMPFNNGLR 
PEGIIALGSLVTWUERSMRYSRLTPAALAVVTAAFTLGVQPTGLIAVAALVAGGRPMLRILVRRHRLV
GTLPL VSPMLAAGT^LTWFADQTLSTVLEATRVRAK.GPSC^VmENLR 

FLITALCLFTAVFIMLRRKR1PSVARGPAWRLMGVIFGTMFFLMFTPTKWVHHFGLFAAVGAAMAALT 

TV/LVSPSVLRWSRNRMAFLAALFFLLALCWATTNGWWWSSYGVPFNSAMPKIDGIWSTIFFALFAI 

AAGYAAWLHFAPRGAGEGRLIRALTTAPVPIVAGFMAAVFVASMVAGIVRQYPTYSNGWSNVRAFV 

GGCGLADDVLVEPDTNAGFMKPLDGDSGSWGPLGPLGGVNPVGFTPNGVPEHTVAEAIVMKPNQP 

GTDYDWDAPTKLTSPGINGSTWLPYGLDPARVPI^GTYTTGAQQQSTLVSAWYLLPKPDDGHPLV 

WTAAGKIAGNSVLHGYTPGQTVVLEYAMPGPGALVPAGRMVPDDLYGEQPKAWRNLRFARAKMP 

ADAVAVRWAEDLSLTPEDWIAVTPPRVPDLRSLQEYVGSTQPVLLDWAVGLAFPCQQPMLHANGIA 

EIPKFRITPDYSAKKLDTDTWEDGTNGGLLGITDLLLRAHVMATYLSRDWARDWGSLRKFDTLVDAP 

PAQLELGTATRSGLWSPGKIR1GP 

>Rv3834c serS seryl-tRNA synthase TB.seq 4307655:4308911 MW:45293 SEQ ID NO:289

V.DLKLLRENPDAVRRSQLSRGEDPALVDALLTADAARRAV.STADSLRAEQKAASKSVGGASPEERP 

PLLRRAKELAEQVKAAEADEVEAEAAFTAAHI^SNVIVDGVPAGGEDDYAVLDWGEPSYLENPKD 

HLELGESLGUDMQRGAKVSGSRFYFLTGRGALLQLGLLQLALKLAVDNGi^PTIPP^VRPEV^GT 

GFLGAHAEEVYRVEGDGLYLVGTSEVPLAGYHSGEILDLSRGPLRYAGWSSCFRREAGSHGKDTRG 

IIRVHQFDKVEGFWCTPADAEHEHERLLGWQRQMLARIEVPYRVIDVAAGDLGSSAARKFDCEAWI 

PTQGAYRELTSTSNCTTFQARRLATRYRDASGKPQIAATLNGTLATTRWLVA.LENHQRPDGSVRVP 

DALVPFVGVEVLEPVA 

>Rv3907c pcnA polynucleotide polymerase TB.seq 4391631:4393070 MW:53057 SEQ ID NO:290

VPEAVQEADLLTAAAVALNRHAALLRELGSVFAAAGHELYLVGGSVRDALLGRLSPDLDFTTDARPE 

RVQEIVRPWADAVWDTGIEFGTVGVGKSDHRME.TTFRADSYDRVSRHPEVRFGDCLEGDLVRRDF 

TTNAMAVRVTATGPGEFLDPLGGLAALRAKVLDTPAAPSGSFGDDPLRMLRAARFVSQLGFAVAPR 

N^EEMAPQU^ISAERVAAELDKLLVGEDPAAGIDLMVQSGMGAWLPEIGGMRMAIDEHHQHK 

DWQHSLTVLRQAIALEDDGPDLVLRWAALLHDIGKPATRRHEPDGGVSFHHHEWGAKMVRKRMR 

ALKYSKQMIDDISQLWLHLRFHGYGDGKWTDSAVRRYVTDAGALLPRLHKLVRA^TTRNKRRAAR 

LQASYDRLEERIAELAAQEDLDRVRPDLDGNQIMAVLDIPAGPQVGEAWRYLKELRLERGPLSTEEA 

TTELLSWWKSRGNR 

A number of embodiments of the invention have been described. Nevertheless,
it will be understood that various modifications may be made without departing from
the spirit and scope of the invention. Accordingly, other embodiments are within the scope
of the following claims.
WHAT IS CLAIMED IS:

1. A method for identifying a nucleic acid or a polypeptide sequence that
may be a target for a drug comprising the following steps: 

(a) providing a first nucleic acid or a polypeptide sequence that is known to 
be a drug target; 

(b) providing at least one algorithm selected from the group consisting of a "domain 
fusion" method, a "phylogenetic profile" method and a "physiologic linkage" method, 
wherein the algorithm is capable of analyzing a functional relationship between nucleic acid or
polypeptide sequences; and 

(c) comparing the first nucleic acid or the polypeptide drug target sequence to a 
plurality of sequences using at least one of the algorithms as set forth in step (b) to identify a 
second sequence that has a functional relationship to the first sequence, thereby identifying a 
nucleic acid or a polypeptide sequence that may be a target for a drug.

2. A method for identifying a nucleic acid or a polypeptide sequence that 
may be essential for the growth or viability of an organism comprising the following steps: 

(a) providing a first nucleic acid or a polypeptide sequence that is known to 
be essential for the growth or viability of an organism; 

(b) providing at least one algorithm capable of analyzing a functional relationship
between nucleic acid or polypeptide sequences selected from the group consisting of a
"domain fusion" method, a "phylogenetic profile" method and a "physiologic linkage"
method; and

(c) comparing the first nucleic acid or the polypeptide sequence to a plurality of 
sequences using at least one of the algorithms as set forth in step (b) to identify a second
sequence that has a functional relationship to the first sequence, thereby identifying a nucleic
acid or a polypeptide sequence that may be essential for the growth or viability of an 
organism. 

3. The method of claim 1 or claim 2, wherein the drug is an antimicrobial drug.
4. The method of claim 1 or claim 2, wherein the first nucleic acid or a
polypeptide sequence is derived from a pathogen. 

5. The method of claim 4, wherein the pathogen is a microorganism. 

6. The method of claim 1 or claim 2, wherein the microorganism is 
Mycobacterium tuberculosis (MTB). 

7. The method of claim 1 or claim 2, wherein the plurality of sequences 
used to identify a second sequence comprises a database of the gene sequences of an entire 
genome of an organism. 



8. The method of claim 1 or claim 2, wherein the plurality of sequences 
used to identify a second sequence comprises a database of the gene sequences derived from
a pathogen.

9. The method of claim 1 or claim 2, wherein the "phylogenetic profile"
method algorithm comprises:

(a) obtaining data, comprising a list of proteins from at least two genomes; 

(b) comparing the list of proteins to form a protein phylogenetic profile for 
each protein, wherein the protein phylogenetic profile indicates the presence or absence of a 
protein belonging to a particular protein family in each of the at least two genomes based on
homology of the proteins; and

(c) grouping the list of proteins based on similar profiles, wherein proteins 
with similar profiles are indicated to have a functional relationship. 
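
Steps (a) through (c) of claim 9 can be illustrated with a minimal Python sketch. It assumes that homology searches have already been run, so that each protein maps to the set of genomes in which a homolog was found. The function names `build_profiles` and `group_by_profile` and the toy genome labels are hypothetical and are not taken from the specification. Grouping only identical profiles is one simple reading of "similar profiles"; a tolerance on the number of differing positions would be an equally reasonable choice.

```python
from collections import defaultdict


def build_profiles(homolog_hits, genomes):
    """Return a presence/absence vector (one entry per genome) for each protein.

    homolog_hits maps a protein name to the set of genomes in which a
    homolog was found (assumed to come from a prior homology search).
    """
    return {
        protein: tuple(1 if genome in present else 0 for genome in genomes)
        for protein, present in homolog_hits.items()
    }


def group_by_profile(profiles):
    """Group proteins whose phylogenetic profiles are identical."""
    groups = defaultdict(list)
    for protein, profile in sorted(profiles.items()):
        groups[profile].append(protein)
    return dict(groups)


# Toy data: three proteins scored against three hypothetical genomes.
genomes = ["genome_A", "genome_B", "genome_C"]
homolog_hits = {
    "P1": {"genome_A", "genome_C"},
    "P2": {"genome_A", "genome_C"},
    "P3": {"genome_B"},
}
profiles = build_profiles(homolog_hits, genomes)
groups = group_by_profile(profiles)
```

In this toy input, P1 and P2 share a profile and would therefore be flagged as candidates for a functional relationship, while P3 falls in its own group.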






10. The method of claim 9, wherein the phylogenetic profile is in the form 
of a vector, matrix or phylogenetic tree. 

11. The method of claim 9, comprising determining the significance of
homology between the proteins by computing a probability (p) value threshold.

12. The method of claim 11, wherein the probability is set with respect to
the value 1/NM, based on the total number of sequence comparisons that are to be
performed, wherein N is the number of proteins in the first organism's genome and M in all
other genomes.
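The threshold of claim 12 amounts to a simple calculation, sketched below. The genome sizes are hypothetical, and treating a hit as significant when its p value falls strictly below 1/NM is an assumption; the claim only says the probability is set "with respect to" that value.

```python
def significance_threshold(n_query_proteins, m_other_proteins):
    """Return 1/(N*M), where N*M is the total number of comparisons."""
    if n_query_proteins <= 0 or m_other_proteins <= 0:
        raise ValueError("N and M must be positive")
    return 1.0 / (n_query_proteins * m_other_proteins)


def is_significant(p_value, n_query_proteins, m_other_proteins):
    """Assumed rule: a hit counts only if its p value is below 1/(N*M)."""
    return p_value < significance_threshold(n_query_proteins, m_other_proteins)


# Hypothetical sizes: N proteins in the first genome, M in all other genomes.
N = 4000
M = 25000
threshold = significance_threshold(N, M)
print(threshold)
```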

13. The method of claim 9, wherein the presence or absence is determined by
calculating an evolutionary distance.

14. The method of claim 13, wherein the evolutionary distance is
calculated by:

(a) aligning two sequences from the list of proteins;

(b) determining an evolution probability process by constructing a conditional
probability matrix: $p(aa \to aa')$, where aa and aa' are any amino acids, said conditional
probability matrix being constructed by converting an amino acid substitution matrix from a
log odds matrix to said conditional probability matrix;

(c) accounting for an observed alignment of the constructed conditional
probability matrix by taking the product of the conditional probabilities for each aligned pair
during the alignment of the two sequences, represented by $P(p) = \prod_{n} p(aa_n \to aa'_n)$; and

(d) determining an evolutionary distance $\alpha$ from the powers equation
$p' = p^{\alpha}(aa \to aa')$, maximizing for $P$.
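Steps (c) and (d) of claim 14 can be sketched on a toy problem. The sketch assumes a hypothetical two-letter alphabet, an ungapped alignment already produced by step (a), and a search over integer powers only; a real implementation would use the 20 amino acids and could allow fractional powers. Reading "maximizing for P" as picking the power that maximizes the alignment probability is an interpretation.

```python
import math


def mat_mul(a, b):
    """Multiply two small square matrices stored as lists of rows."""
    size = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]


def alignment_log_likelihood(matrix, pairs, index):
    """Step (c): log of the product of p(aa -> aa') over aligned pairs."""
    return sum(math.log(matrix[index[x]][index[y]]) for x, y in pairs)


def evolutionary_distance(p, pairs, index, max_alpha=50):
    """Step (d): integer power alpha of p that maximizes the likelihood."""
    best_alpha, best_ll = None, -math.inf
    power = [row[:] for row in p]  # starts as p raised to the power 1
    for alpha in range(1, max_alpha + 1):
        ll = alignment_log_likelihood(power, pairs, index)
        if ll > best_ll:
            best_alpha, best_ll = alpha, ll
        power = mat_mul(power, p)
    return best_alpha, best_ll


# Hypothetical conditional probability matrix over the toy alphabet {A, B}.
index = {"A": 0, "B": 1}
p = [[0.9, 0.1],
     [0.1, 0.9]]
seq1 = "AAAAABBBBB"
seq2 = "AABAABBABA"  # differs from seq1 at 3 of 10 positions
pairs = list(zip(seq1, seq2))

best_alpha, best_ll = evolutionary_distance(p, pairs, index)
print(best_alpha)
```

More mismatches in the alignment push the maximizing power upward, which is why the power can serve as a distance.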

15. The method of claim 14, wherein the conditional probability matrix is
defined by a Markov process with substitution rates, over a fixed time interval.






16. The method of claim 14, where the conversion from an amino acid
substitution matrix to a conditional probability matrix is represented by:

$P(i \to j) = P(j) \, 2^{\mathrm{BLOSUM62}_{ij}/2}$,

where BLOSUM62 is an amino acid substitution matrix, and $P(i \to j)$ is the
probability that amino acid i is replaced by amino acid j through point mutations according to
BLOSUM62 scores.

17. The method of claim 16, where the P(j) are the abundances of amino acid
j and are computed by solving a plurality of linear equations given by the normalization
condition that:

$\sum_{j} P(i \to j) = 1$.
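The conversion of claims 16 and 17 can be sketched as a small linear-algebra problem. The sketch assumes the normalization condition is that each row of the conditional probability matrix sums to one, and that scores are in half-bit units (hence the division by 2 in the exponent). The two-letter alphabet, the `joint` table, and the resulting `toy_scores` are hypothetical stand-ins for BLOSUM62; they are built from known abundances so the recovered values can be checked.

```python
import math


def solve_linear(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def abundances_from_scores(scores):
    """Solve sum_j P(j) * 2**(S_ij / 2) = 1 for every row i."""
    weights = [[2 ** (s / 2) for s in row] for row in scores]
    return solve_linear(weights, [1.0] * len(scores))


def conditional_matrix(scores, abundance):
    """P(i -> j) = P(j) * 2**(S_ij / 2)."""
    size = len(abundance)
    return [
        [abundance[j] * 2 ** (scores[i][j] / 2) for j in range(size)]
        for i in range(size)
    ]


# Hypothetical toy model: known abundances and joint pair frequencies.
true_p = [0.6, 0.4]
joint = [[0.5, 0.1],
         [0.1, 0.3]]
toy_scores = [
    [2 * math.log2(joint[i][j] / (true_p[i] * true_p[j])) for j in range(2)]
    for i in range(2)
]

abundance = abundances_from_scores(toy_scores)
matrix = conditional_matrix(toy_scores, abundance)
print(abundance)
```

Real BLOSUM62 scores are rounded to integers, so rows of a matrix built from them would sum to one only approximately.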

18. The method of claim 1 or claim 2, wherein the "physiologic linkage" 
method algorithm identifies proteins and nucleic acids that participate in a common 
functional pathway. 



19. The method of claim 1 or claim 2, wherein the "physiologic linkage"
method algorithm identifies proteins and nucleic acids that participate in the
synthesis of a common structural complex.

20. The method of claim 1 or claim 2, wherein the "physiologic linkage"
method algorithm identifies proteins and nucleic acids that participate in a
common metabolic pathway.



21. The method of claim 1 or claim 2, wherein the "domain fusion"
method algorithm comprises

(a) aligning a first primary amino acid sequence of multiple distinct non-homologous
polypeptides to a second primary amino acid sequence of a plurality of proteins; and

(b) for any alignment found between the first primary amino acid sequences of all of
such multiple distinct non-homologous polypeptides and at least one protein of the second
primary amino acid sequences, outputting an indication identifying the aligned second
primary amino acid sequence as an indication of a functional link between the aligned first
and second polypeptide sequences.

22. The method of claim 21, wherein the aligning is performed by an
algorithm selected from the group consisting of a Smith-Waterman algorithm, a Needleman-Wunsch
algorithm, a BLAST algorithm, a FASTA algorithm, and a PSI-BLAST algorithm.
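The domain fusion steps can be sketched as follows. The sketch replaces the alignment algorithms named in the claims with exact substring search, which is a deliberate simplification and not one of the listed algorithms. It also assumes the query polypeptides are already known to be non-homologous to each other, and it requires the two matches to fall on non-overlapping regions of the same protein, which is an added assumption. All sequence names and sequences are hypothetical.

```python
from itertools import combinations


def find_match(query, target):
    """Stand-in for a real aligner: exact substring search.

    Returns the (start, end) span of the match in target, or None.
    """
    start = target.find(query)
    return None if start < 0 else (start, start + len(query))


def domain_fusion_links(queries, database):
    """Report query pairs that both match one database protein."""
    links = []
    for target_name, target_seq in sorted(database.items()):
        hits = {}
        for name, seq in queries.items():
            span = find_match(seq, target_seq)
            if span is not None:
                hits[name] = span
        for first, second in combinations(sorted(hits), 2):
            (s1, e1), (s2, e2) = hits[first], hits[second]
            if e1 <= s2 or e2 <= s1:  # keep only non-overlapping matches
                links.append((first, second, target_name))
    return links


# Hypothetical query polypeptides and database proteins.
queries = {"geneX": "MKTAYIAK", "geneY": "GLSDGEWQ", "geneZ": "PPPPHHHH"}
database = {"fusedXY": "MKTAYIAKQRGLSDGEWQLV", "onlyX": "AAMKTAYIAKAA"}

links = domain_fusion_links(queries, database)
print(links)
```

Each reported triple names the two linked query polypeptides and the database protein in which both were found.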

23. The method of claim 21, wherein the multiple distinct non-homologous
polypeptides are obtained by translating a nucleic acid sequence from a genome
database.

24. The method of claim 21, wherein the plurality of proteins have a
known function.

25. The method of claim 21, wherein at least one of the multiple distinct
non-homologous polypeptides has a known function.

26. The method of claim 21, wherein at least one of the multiple distinct
non-homologous polypeptides has an unknown function.



27. The method of claim 21, wherein the alignment is based on the degree
of homology of the multiple distinct non-homologous polypeptides to the plurality of
proteins.

28. The method of claim 21, further comprising determining the
significance of the aligned and identified second primary amino acid sequence by computing
a probability (p) value threshold.

29. The method of claim 28, wherein the probability threshold is set with 
respect to the value 1/NM, based on the total number of sequence comparisons that are to be 
performed, wherein N is the number of proteins in a first organism's genome and M in all 
other genomes. 

30. The method of claim 21, further comprising filtering excessive
functional links between one first primary amino acid sequence of multiple distinct non-homologous
polypeptides and an excessive number of other distinct non-homologous
polypeptides for any alignment found between the first primary amino acid sequences of the
distinct non-homologous polypeptides and at least one of the second primary amino acid
sequences of the plurality of proteins.
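One way to read the filtering of claim 30 is sketched below: a polypeptide linked to too many distinct partners is treated as uninformative, and all of its links are dropped. The cutoff `max_partners` and the protein names are hypothetical; the claim does not state how "excessive" is quantified.

```python
def filter_excessive_links(links, max_partners):
    """Drop every link touching a protein with too many distinct partners.

    links is a list of (protein_a, protein_b) pairs.
    """
    partners = {}
    for a, b in links:
        partners.setdefault(a, set()).add(b)
        partners.setdefault(b, set()).add(a)
    crowded = {p for p, others in partners.items() if len(others) > max_partners}
    return [(a, b) for a, b in links if a not in crowded and b not in crowded]


# Hypothetical links: "hub" is linked to four partners, the rest to few.
raw_links = [
    ("hub", "p1"),
    ("hub", "p2"),
    ("hub", "p3"),
    ("hub", "p4"),
    ("p1", "p2"),
]
kept = filter_excessive_links(raw_links, max_partners=3)
print(kept)
```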


31. A computer program product, stored on a computer-readable medium,
for identifying a nucleic acid or a polypeptide sequence that may be a target for a drug, the
computer program product comprising instructions for causing a computer system to be
capable of:

(a) inputting a first nucleic acid or a polypeptide sequence that is known to be
a drug target;

(b) accessing at least one algorithm capable of analyzing a functional relationship
between nucleic acid or polypeptide sequences selected from the group consisting of a
"domain fusion" method, a "phylogenetic profile" method and a "physiologic linkage"
method; and

(c) comparing the first nucleic acid or the polypeptide drug target sequence to
a plurality of sequences using at least one of the algorithms set forth in step (b) to identify a
second sequence that has a functional relationship to the first sequence and generating an
output identifying a nucleic acid or a polypeptide sequence that may be a target for a drug.
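The overall flow of steps (a) through (c) can be sketched as a lookup over functional links. The sketch assumes each method has already been run and has produced a list of linked pairs; the method outputs, the function name `candidate_targets`, and all protein names are hypothetical.

```python
def candidate_targets(known_target, links_by_method):
    """Collect sequences functionally linked to a known drug target.

    links_by_method maps a method name to a list of (protein_a, protein_b)
    pairs. Returns a dict mapping each linked protein to the methods that
    support the link.
    """
    found = {}
    for method, links in links_by_method.items():
        for a, b in links:
            if a != b and known_target in (a, b):
                partner = b if a == known_target else a
                found.setdefault(partner, []).append(method)
    return found


# Hypothetical precomputed output of two of the methods named in step (b).
links_by_method = {
    "domain fusion": [("targetT", "orfA")],
    "phylogenetic profile": [("orfA", "targetT"), ("orfB", "orfC")],
}
report = candidate_targets("targetT", links_by_method)
print(report)
```

A candidate supported by more than one method, such as `orfA` here, might reasonably be ranked above one supported by a single method.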


32. A computer program product, stored on a computer-readable medium,
for identifying a nucleic acid or a polypeptide sequence that may be essential for the growth
or viability of an organism, the computer program product comprising instructions for
causing a computer system to be capable of:

(a) providing a first nucleic acid or a polypeptide sequence that is known to
be essential for the growth or viability of an organism;

(b) accessing at least one algorithm capable of analyzing a functional relationship
between nucleic acid or polypeptide sequences selected from the group consisting of a
"domain fusion" method, a "phylogenetic profile" method and a "physiologic linkage"
method; and

(c) comparing the first nucleic acid or the polypeptide sequence to a plurality of
sequences using at least one of the algorithms set forth in step (b) to identify a second
sequence that has a functional relationship to the first sequence and generating an output
identifying a nucleic acid or a polypeptide sequence that may be essential for the growth or
viability of an organism.

33. A computer system, comprising:

(a) a processor; and

(b) a computer program product as set forth in claim 31 or claim 32.



Figure 1

Figure 4

Figure 5